
eval-authoring

Author programmatic eval cases for the playmoleculeai-evals repo (cases.yaml, checks.py, fixtures, dev scripts) grounded in real agent traces. Use whenever the user is adding, designing, drafting, refining, or grading an eval case in this project, including phrases like "new eval", "add a case", "wire up a case", "design a test for chat N", any case-id like `xxx-NNNN`, or work touching cases.yaml, checks.py, the RunArtifact API, or the development/dump_chat.py / test_case_*.py dev scripts. Invoke this skill before reading checks.py or starting a new case.

stars: 0
forks: 0
updated: May 6, 2026 at 13:15
SKILL.md