name	propose-judge-patch
description	Drafts a reviewable judge-template patch from evaluator validation disagreements.

Propose Judge Patch

Use this after skills/validate-evaluator.md has produced a JSON validation report with disagreements.

Workflow

Run evaluator validation and save the report as JSON:

node scripts/validate-evaluator.mjs labels/sample-goldens.json --json > reports/evaluator-validation.json

Propose a judge-template patch:

node scripts/propose-judge-patch.mjs reports/evaluator-validation.json --judge-template rag-quality --output reports/proposed-judge.patch

Inspect the patch before applying it.

The script is deterministic. It does not call a model and does not edit judge templates directly. It reads disagreement reasons and human critiques, infers likely rubric gaps, and writes a patch file for human or agent review.
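
As a rough illustration of what a deterministic, model-free pass could look like, the sketch below maps disagreement text to candidate rubric gaps with a fixed keyword table. This is not the actual code in scripts/propose-judge-patch.mjs. The report shape (a `disagreements` array with `reason` and `humanCritique` fields), the gap names, the keyword lists, and the function names `inferRubricGaps` and `renderProposal` are all assumptions made for the example.

```javascript
// Hypothetical keyword-to-gap table; the real script's rules are not shown in this document.
const GAP_RULES = [
  { gap: "missing-citation-check", keywords: ["citation", "source", "unsupported"] },
  { gap: "hallucination-tolerance", keywords: ["hallucinat", "fabricat", "made up"] },
  { gap: "incomplete-answer", keywords: ["incomplete", "partial", "missing step"] },
];

// Count how many disagreements point at each gap.
// Pure function with no model call: the same report always yields the same result.
function inferRubricGaps(report) {
  const counts = new Map();
  for (const item of report.disagreements ?? []) {
    // Assumed field names: `reason` (judge side) and `humanCritique` (labeler side).
    const text = `${item.reason ?? ""} ${item.humanCritique ?? ""}`.toLowerCase();
    for (const rule of GAP_RULES) {
      if (rule.keywords.some((k) => text.includes(k))) {
        counts.set(rule.gap, (counts.get(rule.gap) ?? 0) + 1);
      }
    }
  }
  // Sort by count descending, then by name, so the output order is stable.
  return [...counts.entries()]
    .map(([gap, count]) => ({ gap, count }))
    .sort((a, b) => b.count - a.count || a.gap.localeCompare(b.gap));
}

// Render proposals as added lines for a reviewer; this is not a full unified diff.
function renderProposal(gaps) {
  return gaps
    .map((g) => `+ [${g.gap}] supported by ${g.count} disagreement(s)`)
    .join("\n");
}

// Made-up sample data, only to exercise the functions above.
const sampleReport = {
  disagreements: [
    { reason: "Judge passed an answer with an unsupported claim", humanCritique: "No citation for the key figure" },
    { reason: "Judge passed a partial answer", humanCritique: "Answer is incomplete" },
    { reason: "Judge ignored missing source", humanCritique: "" },
  ],
};

console.log(renderProposal(inferRubricGaps(sampleReport)));
```

Because the mapping is a plain lookup, a reviewer can trace every proposed clause back to the disagreements that triggered it, which is the property that makes the patch reviewable.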

Guardrails

Do not patch a judge from one or two weak examples unless the failure mode is obvious.
Prefer adding concrete pass/fail clauses over rewriting the whole prompt.
Re-run scripts/validate-evaluator.mjs after applying any judge-template change.
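
The first guardrail can be made mechanical with a small filter. The sketch below is one possible reading, not part of the repository's scripts: it treats "one or two weak examples" as fewer than three supporting disagreements, and it models "obvious" failure modes as an explicit reviewer-supplied allowlist. The threshold, the `filterProposals` name, and the `{ gap, count }` proposal shape are assumptions.

```javascript
// Hypothetical threshold: "one or two weak examples" is read here as fewer than 3.
const MIN_SUPPORTING_EXAMPLES = 3;

// Keep a proposed clause only if enough disagreements support it,
// or a reviewer has explicitly marked the failure mode as obvious.
function filterProposals(proposals, obviousGaps = new Set()) {
  return proposals.filter(
    (p) => p.count >= MIN_SUPPORTING_EXAMPLES || obviousGaps.has(p.gap)
  );
}

// Made-up proposals, only to exercise the filter.
const proposals = [
  { gap: "missing-citation-check", count: 5 },
  { gap: "tone-mismatch", count: 1 },
  { gap: "wrong-language", count: 2 },
];

const kept = filterProposals(proposals, new Set(["wrong-language"]));
console.log(kept.map((p) => p.gap).join(", "));
// prints "missing-citation-check, wrong-language"
```

Keeping the override as an explicit set means that accepting a thinly supported clause is a visible human decision and not a silent default.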
