create-fixture
// Create a new eval fixture for whoami.wiki from the user's personal archive data
| name | create-fixture |
| description | Create a new eval fixture for whoami.wiki from the user's personal archive data |
| triggers | ["create-fixture","new fixture","add fixture"] |
| user_invocable | true |
Interactively create a new eval fixture for the whoami.wiki evaluation suite.
/create-fixture [page-type]
Examples:
- /create-fixture: start interactive fixture creation
- /create-fixture person: create a person page fixture
- /create-fixture episode: create an episode page fixture
- /create-fixture project: create a project page fixture
If no page type was provided as an argument, ask the user:
What type of page is this fixture for?
- Person: a biography of someone in your archive (friend, family, colleague)
- Episode: a specific event, trip, or milestone
- Project: a software project, creative work, or collaborative effort
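The optional page-type argument could be handled along these lines. This is a minimal sketch, not the skill's actual implementation: `PageType`, `PAGE_TYPES`, and `parsePageType` are hypothetical names, and case-insensitive matching is an assumption.

```typescript
// Hypothetical helper: these names are illustrative and do not come from the eval suite.
type PageType = "person" | "episode" | "project";

const PAGE_TYPES: readonly PageType[] = ["person", "episode", "project"];

// Returns the page type when the argument names one of the three known types.
// Returns undefined otherwise, which signals that the user should be asked.
function parsePageType(arg?: string): PageType | undefined {
  const normalized = (arg ?? "").trim().toLowerCase();
  return PAGE_TYPES.find((t) => t === normalized);
}
```

With this sketch, a missing or unrecognized argument falls through to the interactive question above rather than raising an error.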
Ask the user for:
incremental)
Ask the user what archive data they have available for this subject. For each source, collect:
instagram, whatsapp, messages, photos, location, transactions, shazam, uber_trips, github, slack
Guide them based on page type:
Based on the page type and available sources, design the checkpoint sequence. Use the examples in evals/fixtures/examples/ as templates.
Standard checkpoint patterns by page type:
Person:
- survey: Snapshot first source, create source page
- draft: Write initial person page from first source (skipReference: true)
- new-source: Add remaining sources, revise page
- episodes: Create episode sub-pages for rich events
- owner-input: Integrate owner testimony (if anecdotes provided)
- verify: Final review + citation manifest
Episode:
- survey: Snapshot photos/location, create source pages
- draft: Write day-by-day itinerary from spatial data (skipReference: true)
- new-source: Add messages/transactions, weave into narrative
- persons: Create person stubs for trip participants
- owner-input: Integrate owner memories (if anecdotes provided)
- verify: Final review + citation manifest
Project:
- survey: Snapshot git repo, create source page
- draft: Write project page from code/commits (skipReference: true)
- new-source: Add Slack/messages, integrate collaboration context
- episodes: Create episode pages for key development moments
- verify: Final review + citation manifest
For each checkpoint, set appropriate grade targets using the subject name and roles. Set threshold: 0.3 on the survey checkpoint.
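The three patterns above differ only in the sub-page step and in whether owner-input appears, so they can be derived from one function. The following is a rough sketch under stated assumptions: the `Checkpoint` shape and the `buildCheckpoints` name are hypothetical, and the real schema in evals/src/types.ts may differ. Only the checkpoint names, `skipReference: true` on draft, and `threshold: 0.3` on survey come from the text.

```typescript
// Hypothetical shapes for illustration; the real types live in evals/src/types.ts.
type PageType = "person" | "episode" | "project";

interface Checkpoint {
  name: string;
  skipReference?: boolean;
  threshold?: number;
}

function buildCheckpoints(pageType: PageType, hasAnecdotes: boolean): Checkpoint[] {
  // Person and project pages spawn episode pages; episode pages spawn person stubs.
  const subPageStep: Record<PageType, string> = {
    person: "episodes",
    episode: "persons",
    project: "episodes",
  };
  const names = ["survey", "draft", "new-source", subPageStep[pageType]];
  // The project pattern listed above has no owner-input step.
  if (hasAnecdotes && pageType !== "project") names.push("owner-input");
  names.push("verify");

  return names.map((name) => {
    const checkpoint: Checkpoint = { name };
    if (name === "survey") checkpoint.threshold = 0.3;
    if (name === "draft") checkpoint.skipReference = true;
    return checkpoint;
  });
}
```

Grade targets are left out of the sketch because they depend on the subject name and roles collected earlier.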
Ask:
Do you have personal memories or corrections about this subject that you'd like the agent to incorporate? These are things the digital sources can't capture: personal stories, context, corrections to what the data shows.
If yes, collect entries interactively. For each entry ask:
Write these to owner-anecdotes.json in the fixture directory.
Ask:
Do you want to write gold-standard reference pages for grading? These are optional: they let the reference grader compare the agent's output against an ideal version.
You can:
- Skip for now: the other graders (completeness, citations, editorial) still work without references
- Write them later: run the eval once, review the agent's output, then refine it into a reference
- Write them now: I'll help you draft reference pages following the editorial guide
If they want to write references now, help them draft wikitext pages following the editorial guide conventions. Save to the references/ subdirectory and add entries to the references map in case.json.
Determine the next available case number by listing existing directories in evals/fixtures/<suite>/:
ls evals/fixtures/<suite>/
Use the next sequential number with zero-padding (e.g., 004-person, 005-trip).
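The numbering rule can be sketched as a small pure function over the directory names returned by the listing. `nextCaseId` is a hypothetical name. Taking the highest existing number plus one, and padding to three digits, are assumptions inferred from the `004-person` example.

```typescript
// Hypothetical helper: computes the next case directory name from existing ones.
function nextCaseId(existingDirs: string[], type: string): string {
  let highest = 0;
  for (const dir of existingDirs) {
    // Only directories that start with digits followed by a hyphen count as cases.
    const match = /^(\d+)-/.exec(dir);
    if (match && match[1] !== undefined) {
      highest = Math.max(highest, parseInt(match[1], 10));
    }
  }
  // Zero-pad to three digits, matching names such as 004-person.
  return `${String(highest + 1).padStart(3, "0")}-${type}`;
}
```

Entries that do not match the numbered pattern are ignored, so an empty suite directory yields `001-<type>`.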
Create the fixture directory and write all files:
evals/fixtures/<suite>/<NNN-type>/
├── case.json
├── owner-anecdotes.json (if anecdotes were provided)
└── references/
    ├── <subject>.wiki (if reference pages were written)
    └── talk-<subject>.wiki (if talk reference was written)
After writing the files:
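- Read case.json and verify it parses correctly
- Check that the source paths in case.json exist on the user's machine
- Check that the files named in references and ownerInput exist in the fixture directory
- Check the fixture against the types in evals/src/types.ts
Print a summary: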
Created fixture: evals/fixtures/<suite>/<case-id>/
Page type: Person
Subject: Sarah Kim
Sources: instagram, whatsapp
Checkpoints: survey → draft → new-source → episodes → owner-input → verify
References: yes/no
Owner anecdotes: 4 entries
Run it with:
cd evals && pnpm eval --suite <suite> --case <case-id> --harness claude-code
- Fixtures in fixtures/incremental/ are gitignored: personal data stays local
- The snapshotId field in sources starts empty and is populated at runtime by wai snapshot
- File paths in case.json are relative to the fixture directory
- Use slug-case for reference filenames (e.g., alex-chen.wiki, talk-alex-chen.wiki)
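The slug-case naming for reference files might be derived from the subject name as follows. Both `toSlug` and `referenceFilenames` are hypothetical helpers, and this sketch assumes ASCII names: characters outside a-z and 0-9 are collapsed into hyphens, so accented letters would be lost.

```typescript
// Hypothetical helper: lowercases the name and joins alphanumeric runs with hyphens.
function toSlug(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
}

// Builds the page and talk-page filenames used in the references/ subdirectory.
function referenceFilenames(subject: string): { page: string; talk: string } {
  const slug = toSlug(subject);
  return { page: `${slug}.wiki`, talk: `talk-${slug}.wiki` };
}
```

For the subject "Alex Chen" this produces alex-chen.wiki and talk-alex-chen.wiki, matching the example filenames above.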