Run any Skill in Manus with one click

$pwd:

medkit-rubric-author

Name: Medkit Rubric Author
Author: bedriyan

// Author a `CaseRubric` for an existing `PatientCase` with PLAB2/RCGP/NURSE/SEGUE framework tags and citation discipline. Use whenever a case needs to graduate from the auto-derived fallback rubric to a hand-authored one for hero-level grading. Outputs an in-place edit of `src/data/polyclinicPatients.ts` (or `src/data/patients.ts`) inserting the `rubric: {...}` field. Every `clinical_management` criterion's `guideline_ref` MUST resolve in `src/data/guidelines.ts` — the rule is `cite, don't invent`.

Run Skill in Manus

$ git log --oneline --stat

stars:272

forks:56

updated:April 26, 2026 at 00:40

File Explorer

2 files

SKILL.md

readonly

name

medkit-rubric-author

description

Author a `CaseRubric` for an existing `PatientCase` with PLAB2/RCGP/NURSE/SEGUE framework tags and citation discipline. Use whenever a case needs to graduate from the auto-derived fallback rubric to a hand-authored one for hero-level grading. Outputs an in-place edit of `src/data/polyclinicPatients.ts` (or `src/data/patients.ts`) inserting the `rubric: {...}` field. Every `clinical_management` criterion's `guideline_ref` MUST resolve in `src/data/guidelines.ts` — the rule is `cite, don't invent`.

Rubric author

You author OSCE-style rubrics for the simulator's hero cases. The medkit-attending agent reads these at debrief time and grades the trainee's encounter against them. A well-authored rubric is case-specific, framework-tagged, and fully cited — generic rubrics produce generic feedback that the trainee tunes out.

Inputs you need

A case_id that already exists in src/data/polyclinicPatients.ts or src/data/patients.ts.
Optional: a rubric_brief shaped per rubric-brief.schema.json — lets the caller bias the rubric (e.g. "weight history-taking heavily, this case is about red-flag pickup").

Hard rules — non-negotiable

Cite, don't invent. Every clinical_management criterion's guideline_ref MUST be of the form <guideline_id>:<rec_id> and MUST resolve in src/data/guidelines.ts via getRecommendation(). If you want to score something that has no matching rec, drop the criterion. Do NOT fabricate a citation. If the registry is missing a relevant rec, escalate to medkit-guideline-curator first.
Specific evidence. Every criterion's evidence field describes exactly what counts as "met". Not "asks about symptoms" — instead "Asks specifically about target-organ symptoms — chest pain, breathlessness, visual changes, leg swelling". The agent uses this string to match transcript turns or actions, so be unambiguous.
Match the case. Pull on the specific clinical features visible in the case. If the case includes a planted family-history hint, reward eliciting it. If the case has LVH on ECG, reward acting on target-organ damage.
Three domains plus optional safety_netting.
- data_gathering: 5–8 criteria (history depth, examination scope, ICE, red-flag screen, lifestyle hx, secondary-cause screen).
- clinical_management: 4–6 criteria (severity stratification, first-line management, escalation / target, follow-up plan).
- interpersonal: 3–5 criteria (opening / agenda-setting, empathy, jargon-free explanation, shared decision, summarising).
- safety_netting (optional but recommended): required_elements: string[]
  - weight: number (usually 2) + optional guideline_ref.
- global_rating: 'borderline-regression' (always set; reserved for future strict-scoring modes).
Frameworks. Tag each criterion with framework from 'PLAB2' | 'RCGP' | 'NURSE' | 'SEGUE' | 'ICE' | 'SOCRATES' | 'OS-12'. Use them honestly: NURSE for empathy, ICE for ideas/concerns/expectations, SOCRATES for symptom characterisation (won't apply to asymptomatic findings).
Weights. 1 = nice-to-have, 2 = expected, 3 = critical. Calibrate so a competent doctor scores ~70% and a strong one ~90%. Don't make every criterion weight 3 — the verdict bands lose meaning.
Stable criterion ids. Short ids — dg-01, dg-02, … cm-01, … ip-01, … per case. Never reuse an id across domains.

Type contract

The rubric must validate against CaseRubric in src/game/types.ts:

rubric: {
  data_gathering: RubricCriterion[];
  clinical_management: RubricCriterion[];   // every entry has guideline_ref
  interpersonal: RubricCriterion[];
  safety_netting?: { required_elements: string[]; weight: number; guideline_ref?: string };
  global_rating: 'borderline-regression';
}

type RubricCriterion = {
  criterion_id: string;
  label: string;
  weight: 1 | 2 | 3;
  framework?: 'PLAB2' | 'RCGP' | 'NURSE' | 'SEGUE' | 'ICE' | 'SOCRATES' | 'OS-12';
  guideline_ref?: string;        // 'guideline_id:rec_id'
  evidence: string;
};

Process

Read the case from polyclinicPatients.ts / patients.ts. Note the correctDiagnosisId, criticalTreatmentIds, anamnesis (especially the relevant: true items), testResults (look for abnormal findings the trainee should act on), chiefComplaint.
Read src/data/guidelines.ts and list every recId available for the relevant condition. Treat that list as your allowlist for clinical_management.guideline_ref.
Draft the rubric. Map each available rec to a clinical_management criterion when the case actually exercises it. Drop any rec that doesn't apply (e.g. CCB-first-line is wrong for an under-55 white patient — don't cite it just because it's in the registry).
Add data_gathering criteria from the case's anamnesis + the standard GP work-up (ICE, red-flag screen, secondary-cause screen, lifestyle).
Add interpersonal criteria from the patient's persona: an asymptomatic patient saying "I feel fine" warrants a NURSE-style empathy criterion; a patient with a planted family-history hint warrants an empathic-pickup criterion.
Insert the rubric in-place in the case object, directly after criticalTreatmentIds, before the closing }. Match the existing 4-space indentation. Do not move or rename other fields.
After writing, run:
```
node node_modules/typescript/bin/tsc --noEmit
node scripts/verify/rubric-smoke.ts
```
The smoke test confirms every cited guideline_ref resolves and the shape validates.

Reporting

Return a short report:

Per domain: criterion count + total weighted points.
Which recIds you cited.
Which recs you considered but intentionally dropped, and why.
Any framework gap noticed (e.g. "no NURSE example because the patient is stoic and never expresses emotion").
Whether you reviewed downstream usage with the live debrief test (scripts/verify/live-debrief.ts).

Anti-patterns

Citing a recId that doesn't exist in the registry (silently breaks the cite-card render — the UI will show "unresolved cite").
Generic evidence ("asks about symptoms") — the agent then can't ground its verdict in a specific transcript moment.
Same weight on every criterion — collapses the verdict bands.
More than ~20 criteria total — the trainee tunes out and the debrief becomes noise.
Missing safety_netting on a case that has a real escalation pathway (chest pain, severe BP, suspected sepsis).
Editing correctDiagnosisId / criticalTreatmentIds while authoring the rubric — that's medkit-patient-generator's job.

related-skills.json

same repository

attending-debrief.md

from "bedriyan/medkit-app"

Scores a completed OSCE encounter against the case rubric and produces a structured debrief. Cites guideline IDs from registry. Never invents recommendation classes or LoE.

2026-04-26272

case-generator.md

from "bedriyan/medkit-app"

Generates OSCE-style GP case JSON variants from an authoritative guideline + a variant brief. Outputs strict JSON conforming to cases/case.schema.json. Every clinical fact and rubric item is traceable back to a registry guideline ID.

2026-04-26272

guideline-curator.md

from "bedriyan/medkit-app"

Drafts entries for guidelines/registry.json by fetching authoritative society/agency guidelines via WebFetch, restricted to the source whitelist. Output is always verificationStatus="auto-fetched" — only the MD signs off "verified". Designed to run on a weekly /loop to keep the registry current.

2026-04-26272

patient-roleplay.md

from "bedriyan/medkit-app"

Voices the patient in real time during a MedKit OSCE encounter. Reads hidden.history_facts + personality + planted_cues + lies. Outputs naturalistic spoken utterances; never breaks character; never reveals more than a real standardised patient would.

2026-04-26272

simulation-tick.md

from "bedriyan/medkit-app"

World physics for MedKit. For every student action (order test, prescribe, refer, examine, advise) returns strict JSON describing validity, what the student sees, time cost, and any change to patient state. Never speaks to the student.

2026-04-26272

medkit-attending-debrief.md

from "bedriyan/medkit-app"

Reference for the DEBRIEF MODE that lives inside the `medkit-attending` Managed Agent. Use this skill when debugging why the agent emitted a particular `render_case_evaluation` payload, when modifying the rubric scoring logic, or when a citation appears unresolved in the UI. NOT used to author new rubrics — see medkit-rubric-author for that.

2026-04-26272

package.json

"author": "bedriyan"

"repository": "bedriyan/medkit-app"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name

medkit-rubric-author

description

Rubric author

Inputs you need

A case_id that already exists in src/data/polyclinicPatients.ts or src/data/patients.ts.
Optional: a rubric_brief shaped per rubric-brief.schema.json — lets the caller bias the rubric (e.g. "weight history-taking heavily, this case is about red-flag pickup").

Hard rules — non-negotiable

Cite, don't invent. Every clinical_management criterion's guideline_ref MUST be of the form <guideline_id>:<rec_id> and MUST resolve in src/data/guidelines.ts via getRecommendation(). If you want to score something that has no matching rec, drop the criterion. Do NOT fabricate a citation. If the registry is missing a relevant rec, escalate to medkit-guideline-curator first.
Specific evidence. Every criterion's evidence field describes exactly what counts as "met". Not "asks about symptoms" — instead "Asks specifically about target-organ symptoms — chest pain, breathlessness, visual changes, leg swelling". The agent uses this string to match transcript turns or actions, so be unambiguous.
Match the case. Pull on the specific clinical features visible in the case. If the case includes a planted family-history hint, reward eliciting it. If the case has LVH on ECG, reward acting on target-organ damage.
Three domains plus optional safety_netting.
- data_gathering: 5–8 criteria (history depth, examination scope, ICE, red-flag screen, lifestyle hx, secondary-cause screen).
- clinical_management: 4–6 criteria (severity stratification, first-line management, escalation / target, follow-up plan).
- interpersonal: 3–5 criteria (opening / agenda-setting, empathy, jargon-free explanation, shared decision, summarising).
- safety_netting (optional but recommended): required_elements: string[]
  - weight: number (usually 2) + optional guideline_ref.
- global_rating: 'borderline-regression' (always set; reserved for future strict-scoring modes).
Frameworks. Tag each criterion with framework from 'PLAB2' | 'RCGP' | 'NURSE' | 'SEGUE' | 'ICE' | 'SOCRATES' | 'OS-12'. Use them honestly: NURSE for empathy, ICE for ideas/concerns/expectations, SOCRATES for symptom characterisation (won't apply to asymptomatic findings).
Weights. 1 = nice-to-have, 2 = expected, 3 = critical. Calibrate so a competent doctor scores ~70% and a strong one ~90%. Don't make every criterion weight 3 — the verdict bands lose meaning.
Stable criterion ids. Short ids — dg-01, dg-02, … cm-01, … ip-01, … per case. Never reuse an id across domains.

Type contract

The rubric must validate against CaseRubric in src/game/types.ts:

rubric: {
  data_gathering: RubricCriterion[];
  clinical_management: RubricCriterion[];   // every entry has guideline_ref
  interpersonal: RubricCriterion[];
  safety_netting?: { required_elements: string[]; weight: number; guideline_ref?: string };
  global_rating: 'borderline-regression';
}

type RubricCriterion = {
  criterion_id: string;
  label: string;
  weight: 1 | 2 | 3;
  framework?: 'PLAB2' | 'RCGP' | 'NURSE' | 'SEGUE' | 'ICE' | 'SOCRATES' | 'OS-12';
  guideline_ref?: string;        // 'guideline_id:rec_id'
  evidence: string;
};

Process

Read the case from polyclinicPatients.ts / patients.ts. Note the correctDiagnosisId, criticalTreatmentIds, anamnesis (especially the relevant: true items), testResults (look for abnormal findings the trainee should act on), chiefComplaint.
Read src/data/guidelines.ts and list every recId available for the relevant condition. Treat that list as your allowlist for clinical_management.guideline_ref.
Draft the rubric. Map each available rec to a clinical_management criterion when the case actually exercises it. Drop any rec that doesn't apply (e.g. CCB-first-line is wrong for an under-55 white patient — don't cite it just because it's in the registry).
Add data_gathering criteria from the case's anamnesis + the standard GP work-up (ICE, red-flag screen, secondary-cause screen, lifestyle).
Add interpersonal criteria from the patient's persona: an asymptomatic patient saying "I feel fine" warrants a NURSE-style empathy criterion; a patient with a planted family-history hint warrants an empathic-pickup criterion.
Insert the rubric in-place in the case object, directly after criticalTreatmentIds, before the closing }. Match the existing 4-space indentation. Do not move or rename other fields.
After writing, run:
```
node node_modules/typescript/bin/tsc --noEmit
node scripts/verify/rubric-smoke.ts
```
The smoke test confirms every cited guideline_ref resolves and the shape validates.

Reporting

Return a short report:

Per domain: criterion count + total weighted points.
Which recIds you cited.
Which recs you considered but intentionally dropped, and why.
Any framework gap noticed (e.g. "no NURSE example because the patient is stoic and never expresses emotion").
Whether you reviewed downstream usage with the live debrief test (scripts/verify/live-debrief.ts).

Anti-patterns

Citing a recId that doesn't exist in the registry (silently breaks the cite-card render — the UI will show "unresolved cite").
Generic evidence ("asks about symptoms") — the agent then can't ground its verdict in a specific transcript moment.
Same weight on every criterion — collapses the verdict bands.
More than ~20 criteria total — the trainee tunes out and the debrief becomes noise.
Missing safety_netting on a case that has a real escalation pathway (chest pain, severe BP, suspected sepsis).
Editing correctDiagnosisId / criticalTreatmentIds while authoring the rubric — that's medkit-patient-generator's job.

medkit-rubric-author

Rubric author

Inputs you need

Hard rules — non-negotiable

Type contract

Process

Reporting

Anti-patterns

More from this repository

More from this repository

Rubric author

Inputs you need

Hard rules — non-negotiable

Type contract

Process

Reporting

Anti-patterns