writing-bench-task-judge

// Use when writing or modifying `check_goals()` / `get_answer()` / App `check_*` methods in `bench_env/task/`, or when reviewing a draft task's judge correctness. Triggers include adding a new task, editing a judge method, or diagnosing a judge false-positive/negative.

$ git log --oneline --stat
stars: 329
forks: 54
updated: May 26, 2026 at 04:06
SKILL.md