| name | medkit-attending-debrief |
| description | Reference for the DEBRIEF MODE that lives inside the `medkit-attending` Managed Agent. Use this skill when debugging why the agent emitted a particular `render_case_evaluation` payload, when modifying the rubric scoring logic, or when a citation appears unresolved in the UI. NOT used to author new rubrics — see medkit-rubric-author for that. |
medkit-attending — DEBRIEF MODE reference
The medkit-attending Managed Agent (Opus 4.7) runs in two modes during a session:
- Live mode: observes the encounter, optionally emits render_triage_badge
on ER arrivals.
- DEBRIEF MODE: kicks in when the trainee submits a [debrief request]
message at the end of the encounter; emits exactly one
render_case_evaluation and stops.
This skill documents DEBRIEF MODE. The full system prompt is in
backend/server.py under MEDKIT_ATTENDING_SYSTEM_PROMPT.
Trigger contract
The trainee's frontend posts a user.message whose body starts with the
literal header [debrief request], followed by a JSON block produced by
buildDebriefRequest(). The block contains:
- case_id and case_summary (chief complaint, correct diagnosis, severity)
- rubric: full CaseRubric object (data_gathering, clinical_management,
interpersonal, optional safety_netting)
- registry_slice: only the guidelines and recommendations cited by the
rubric; acts as both context and an allowlist
- encounter_log: chronological history Q&A, ordered tests with timestamps,
treatments given, prescriptions, submitted diagnosis, correctness flag
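The trigger contract above can be sketched as a small parser. This is an illustrative sketch, not the project's code: the function name parse_debrief_request, the constant names, and the assumption that the five names listed above are top-level keys of the JSON block are all hypothetical.

```python
import json
from typing import Optional

DEBRIEF_HEADER = "[debrief request]"  # literal header from the trigger contract

# Assumption: the items listed above are top-level keys of the JSON block.
EXPECTED_KEYS = {"case_id", "case_summary", "rubric", "registry_slice", "encounter_log"}


def parse_debrief_request(body: str) -> Optional[dict]:
    """Return the parsed JSON block, or None if the message is not a debrief request."""
    if not body.startswith(DEBRIEF_HEADER):
        return None
    rest = body[len(DEBRIEF_HEADER):]
    start = rest.find("{")
    if start == -1:
        raise ValueError("debrief request has no JSON block")
    # raw_decode parses the first JSON value and ignores any trailing text.
    payload, _ = json.JSONDecoder().raw_decode(rest[start:])
    missing = EXPECTED_KEYS - set(payload)
    if missing:
        raise ValueError(f"debrief request missing keys: {sorted(missing)}")
    return payload
```

A helper like this is mainly useful when debugging: feeding it the raw user.message body shows quickly whether a bad evaluation started with a malformed request.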
Output contract
A single render_case_evaluation tool use whose input validates against
caseEvaluationInput (the Zod schema mirrors the JSON schema in
backend/server.py:MEDKIT_CUSTOM_TOOLS). Required fields: case_id,
global_rating, domain_scores, criteria, highlights, improvements, narrative.
Optional: safety_breach (object or null).
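As a rough first-pass check when inspecting a payload by hand, the required and optional fields above can be tested without the full schema. The sketch below is an assumption-laden stand-in, not the caseEvaluationInput schema itself: check_case_evaluation is a hypothetical helper, and it checks only presence and the object-or-null rule, not the shape of each field.

```python
from typing import List

# Required top-level fields, as listed in the output contract.
REQUIRED_FIELDS = (
    "case_id",
    "global_rating",
    "domain_scores",
    "criteria",
    "highlights",
    "improvements",
    "narrative",
)


def check_case_evaluation(tool_input: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in tool_input
    ]
    # safety_breach is optional, but when present it must be an object or null.
    breach = tool_input.get("safety_breach")
    if breach is not None and not isinstance(breach, dict):
        problems.append("safety_breach must be an object or null")
    return problems
```

Passing this check does not imply the Zod parser will accept the payload; it only rules out the most common omissions.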
Hard rules baked into the agent prompt
- Cite, don't invent. Every clinical_management criterion's guideline_ref
must appear in registry_slice.recommendations[].recId. If no rec applies,
the agent drops the criterion. It never fabricates a citation.
- Specific evidence. "You missed ICE" is not enough. The expected bar is
a transcript-quoted observation tied to the case (e.g. patient hinted at
father's stroke, trainee didn't pick it up).
- Safety first. A contraindicated drug, missed red-flag escalation, or
no safety-netting on a high-risk diagnosis sets safety_breach, and the
narrative leads with it regardless of the total score.
- Bands for verdict (per domain and global): ≥0.85 excellent, ≥0.70 good,
≥0.55 satisfactory, ≥0.40 borderline, otherwise clear-fail.
- No clinical advice for real patients. Output framed as training only.
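The verdict bands in the rules above amount to a threshold lookup. The following is a minimal sketch, assuming scores are fractions between 0 and 1; the name verdict_for and the BANDS table are illustrative and not taken from the codebase.

```python
# Thresholds from the banding rule, highest first.
BANDS = [
    (0.85, "excellent"),
    (0.70, "good"),
    (0.55, "satisfactory"),
    (0.40, "borderline"),
]


def verdict_for(score: float) -> str:
    """Map a fractional score (assumed to lie in 0..1) to its verdict band."""
    for threshold, label in BANDS:
        if score >= threshold:
            return label
    return "clear-fail"
```

Under these assumptions, verdict_for(0.72) gives "good" and verdict_for(0.39) gives "clear-fail". Note that the band is independent of safety_breach: a high band does not stop the narrative from leading with a breach.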
Files in the chain
Deploying changes to the agent
The agent definition lives on Anthropic's platform. After editing the system
prompt or tool schema in backend/server.py:
- Restart the FastAPI backend (so the Python module re-loads the constants).
- Run: curl -X POST http://127.0.0.1:8787/agent/refresh -H "Origin: http://localhost:5173"
- The response shows the new agent version. Existing sessions keep their
pinned version; new sessions pick up the latest.
If the schema or system prompt drifts between code and platform, the agent
may emit payloads that the Zod parser rejects. The useAttendingDebrief hook
returns a validation error in that case; check the browser console.
Smoke tests
Run these whenever you touch the rubric schema, the registry, or the agent
prompt.
Anti-patterns
- Editing the Zod schema without updating the matching JSON schema in
backend/server.py (or vice versa) — they MUST match.
- Letting the agent emit render_case_grade (the legacy flat-score tool). It
was deprecated when DEBRIEF MODE shipped; the system prompt should never
mention it.
- Treating an unresolved citation in the UI as an agent bug. It usually means
the rubric cites a recId that isn't in the registry: either author the
recommendation in guidelines.ts (via medkit-guideline-curator) or remove the
citation from the rubric.
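To tell a rubric problem from an agent problem, it can help to list which cited recIds are absent from the registry slice. The sketch below assumes only what this document states, namely that recommendations live under registry_slice.recommendations[] with a recId field. The function name unresolved_citations and the sample ids are hypothetical, and how the cited ids are collected from the rubric is left to the caller because the rubric's internal shape is not described here.

```python
from typing import Iterable, List


def unresolved_citations(cited_rec_ids: Iterable[str], registry_slice: dict) -> List[str]:
    """Return cited recIds that the registry slice does not contain, in first-seen order."""
    known = {rec["recId"] for rec in registry_slice.get("recommendations", [])}
    missing: List[str] = []
    for rec_id in cited_rec_ids:
        if rec_id not in known and rec_id not in missing:
            missing.append(rec_id)
    return missing


# Usage with made-up ids: "rec-b" is cited but not in the slice.
example_slice = {"recommendations": [{"recId": "rec-a"}, {"recId": "rec-c"}]}
print(unresolved_citations(["rec-a", "rec-b", "rec-b"], example_slice))
```

A non-empty result points at the rubric or the registry rather than at the agent.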