task-review

// SkillsBench task PR review. Classifies the task track (standard, research, or multimodal), runs static policy checks against the track-specific rubric, benchmarks the task with the oracle and with Claude and Codex (with and without skills), audits trajectories for cheating and skill invocation, and produces a `pr-N-task-timestamp-run.txt` review report alongside a `prN.zip` bundle of trajectories. Use it to review a SkillsBench task PR (by number, branch, or local task path), or when the user asks to review a task, run benchmarks on a PR, audit a submission, classify a task as research or multimodal track, or prepare a comment to post on a SkillsBench PR.

stars: 1,272
forks: 307
updated: May 5, 2026 at 16:58
File explorer
13 files
SKILL.md
readonly