create-fixture
// Create a new eval fixture for whoami.wiki from the user's personal archive data
| name | create-fixture |
| description | Create a new eval fixture for whoami.wiki from the user's personal archive data |
| triggers | ["create-fixture","new fixture","add fixture"] |
| user_invocable | true |
Interactively create a new eval fixture for the whoami.wiki evaluation suite.
/create-fixture [page-type]
Examples:
- /create-fixture: start interactive fixture creation
- /create-fixture person: create a person page fixture
- /create-fixture episode: create an episode page fixture
- /create-fixture project: create a project page fixture
If no page type was provided as an argument, ask the user:
What type of page is this fixture for?
- Person — a biography of someone in your archive (friend, family, colleague)
- Episode — a specific event, trip, or milestone
- Project — a software project, creative work, or collaborative effort
Ask the user for:
incremental)
Ask the user what archive data they have available for this subject. For each source, collect:
instagram, whatsapp, messages, photos, location, transactions, shazam, uber_trips, github, slack
Guide them based on page type:
Based on the page type and available sources, design the checkpoint sequence. Use the examples in evals/fixtures/examples/ as templates.
Standard checkpoint patterns by page type:
Person:
- survey: Snapshot first source, create source page
- draft: Write initial person page from first source (skipReference: true)
- new-source: Add remaining sources, revise page
- episodes: Create episode sub-pages for rich events
- owner-input: Integrate owner testimony (if anecdotes provided)
- verify: Final review + citation manifest
Episode:
- survey: Snapshot photos/location, create source pages
- draft: Write day-by-day itinerary from spatial data (skipReference: true)
- new-source: Add messages/transactions, weave into narrative
- persons: Create person stubs for trip participants
- owner-input: Integrate owner memories (if anecdotes provided)
- verify: Final review + citation manifest
Project:
- survey: Snapshot git repo, create source page
- draft: Write project page from code/commits (skipReference: true)
- new-source: Add Slack/messages, integrate collaboration context
- episodes: Create episode pages for key development moments
- verify: Final review + citation manifest
For each checkpoint, set appropriate grade targets using the subject name and roles. Set threshold: 0.3 on the survey checkpoint.
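The checkpoint patterns above can be written down as data. The following is a minimal sketch, not the suite's actual schema: only `skipReference` and `threshold` come from the text, while the `id` field, the `Checkpoint` shape, and the `planCheckpoints` helper are hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes: only `skipReference` and `threshold` appear in the text above.
type PageType = "person" | "episode" | "project";

interface Checkpoint {
  id: string;
  skipReference?: boolean;
  threshold?: number;
}

// Checkpoint order per page type, copied from the patterns listed above.
const SEQUENCES: Record<PageType, string[]> = {
  person: ["survey", "draft", "new-source", "episodes", "owner-input", "verify"],
  episode: ["survey", "draft", "new-source", "persons", "owner-input", "verify"],
  project: ["survey", "draft", "new-source", "episodes", "verify"],
};

function planCheckpoints(pageType: PageType, hasAnecdotes: boolean): Checkpoint[] {
  return SEQUENCES[pageType]
    // owner-input is only included when the owner provided anecdotes
    .filter((id) => id !== "owner-input" || hasAnecdotes)
    .map((id) => {
      const checkpoint: Checkpoint = { id };
      if (id === "survey") checkpoint.threshold = 0.3; // survey threshold from the text
      if (id === "draft") checkpoint.skipReference = true; // drafts skip the reference grader
      return checkpoint;
    });
}
```

Grade targets are left out here because they depend on the subject name and roles collected earlier in the conversation.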
Ask:
Do you have personal memories or corrections about this subject that you'd like the agent to incorporate? These are things the digital sources can't capture — personal stories, context, corrections to what the data shows.
If yes, collect entries interactively. For each entry ask:
Write these to owner-anecdotes.json in the fixture directory.
Ask:
Do you want to write gold-standard reference pages for grading? These are optional — they let the reference grader compare the agent's output against an ideal version.
You can:
- Skip for now — the other graders (completeness, citations, editorial) still work without references
- Write them later — run the eval once, review the agent's output, then refine it into a reference
- Write them now — I'll help you draft reference pages following the editorial guide
If they want to write references now, help them draft wikitext pages following the editorial guide conventions. Save to the references/ subdirectory and add entries to the references map in case.json.
Determine the next available case number by listing existing directories in evals/fixtures/<suite>/:
ls evals/fixtures/<suite>/
Use the next sequential number with zero-padding (e.g., 004-person, 005-trip).
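Picking the next case number could look like the sketch below. It assumes directory names start with a numeric prefix followed by a hyphen, pads to three digits as in the examples, and starts at 001 when no numbered directory exists; `nextCaseId` is a hypothetical helper, not part of the suite.

```typescript
// Takes the directory names printed by `ls evals/fixtures/<suite>/`
// and returns the next zero-padded case id for the given type slug.
function nextCaseId(existingDirs: string[], typeSlug: string): string {
  let highest = 0;
  for (const name of existingDirs) {
    // Only names with a leading numeric prefix count, e.g. "003-project".
    const match = /^(\d+)-/.exec(name);
    if (match) highest = Math.max(highest, Number(match[1]));
  }
  // Assumption: numbering starts at 001 and uses three digits.
  return `${String(highest + 1).padStart(3, "0")}-${typeSlug}`;
}
```

For example, `nextCaseId(["001-person", "002-trip", "003-project"], "person")` returns `"004-person"`. Names without a numeric prefix are ignored.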
Create the fixture directory and write all files:
evals/fixtures/<suite>/<NNN-type>/
├── case.json
├── owner-anecdotes.json (if anecdotes were provided)
└── references/
    ├── <subject>.wiki (if reference pages were written)
    └── talk-<subject>.wiki (if talk reference was written)
After writing the files:
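- Read case.json and verify it parses correctly
- Check that the source paths in case.json exist on the user's machine
- Check that the files named in references and ownerInput exist in the fixture directory
- Check case.json against the types in evals/src/types.ts
Print a summary: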
Created fixture: evals/fixtures/<suite>/<case-id>/
Page type: Person
Subject: Sarah Kim
Sources: instagram, whatsapp
Checkpoints: survey → draft → new-source → episodes → owner-input → verify
References: yes/no
Owner anecdotes: 4 entries
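The post-write checks (parsing case.json and confirming that the files it names exist) might be sketched as follows. The `FixtureCase` shape is an assumption for illustration only: the real types live in evals/src/types.ts, and the `path` field on sources, the string-valued `references` map, and the string-valued `ownerInput` are guesses. The `exists` callback is injected so the sketch stays independent of any particular file-system API.

```typescript
// Hypothetical shape of case.json; consult evals/src/types.ts for the real one.
interface FixtureCase {
  sources?: { path?: string }[];
  references?: Record<string, string>;
  ownerInput?: string;
}

// Returns a list of human-readable problems; an empty list means all checks passed.
function verifyFixture(caseJson: string, exists: (path: string) => boolean): string[] {
  let parsed: FixtureCase | null;
  try {
    parsed = JSON.parse(caseJson) as FixtureCase | null;
  } catch {
    return ["case.json does not parse as JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["case.json is not a JSON object"];
  }
  const problems: string[] = [];
  for (const source of parsed.sources ?? []) {
    if (source.path && !exists(source.path)) problems.push(`missing source: ${source.path}`);
  }
  for (const file of Object.values(parsed.references ?? {})) {
    if (!exists(file)) problems.push(`missing reference: ${file}`);
  }
  if (parsed.ownerInput && !exists(parsed.ownerInput)) {
    problems.push(`missing owner input: ${parsed.ownerInput}`);
  }
  return problems;
}
```

In practice `exists` would resolve source paths against the user's machine and reference or owner-input paths against the fixture directory.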
Run it with:
cd evals && pnpm eval --suite <suite> --case <case-id> --harness claude-code
- Fixtures in fixtures/incremental/ are gitignored, so personal data stays local
- The snapshotId field in sources starts empty and is populated at runtime by wai snapshot
- Paths in case.json are relative to the fixture directory
- Use slug-case for reference filenames (e.g., alex-chen.wiki, talk-alex-chen.wiki)
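The slug-case filename convention can be illustrated with a small sketch. `toSlug` and `referenceFilenames` are hypothetical helpers; treating every run of non-alphanumeric ASCII characters as a single hyphen is an assumption, and names with accented characters may need different handling.

```typescript
// Lowercases the name and collapses anything that is not a-z or 0-9 into one hyphen.
function toSlug(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
}

// Builds the page and talk-page reference filenames for a subject.
function referenceFilenames(subject: string): { page: string; talk: string } {
  const slug = toSlug(subject);
  return { page: `${slug}.wiki`, talk: `talk-${slug}.wiki` };
}
```

For the subject "Alex Chen" this yields `alex-chen.wiki` and `talk-alex-chen.wiki`, matching the filenames in the note above.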