
agent-evaluator

// Evaluate a subagent by running its scenario suite as parallel subagents, scoring the results against observable criteria, and producing exact FIND/REPLACE edits for any failing behaviors. Use this skill whenever the user wants to evaluate, test, score, or quality-check a subagent, even if they don't use those exact words. Trigger phrases: "evaluate the agent", "run evals on [agent]", "assess agent performance", "score the agent", "check if the agent works", "run the eval suite", "test the agent", "quality-check [agent]", "is the agent working correctly", or, after fixes have been applied, "did that improve things?". Also triggers proactively when an agent file has been edited and the user asks "does it work now?" or similar. Trigger keywords: evaluate agent, run evals, assess agent, score agent, test agent, quality-check, eval suite, agent performance, agent working, improve agent, rerun evals, benchmark agent.
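
The evaluation loop described above (run each scenario in parallel, then score the output against observable criteria) can be sketched roughly as follows. This is a minimal illustration and not the skill's actual implementation: `Scenario`, `evaluate`, `failing`, and `stub_agent` are hypothetical names, the agent is modeled as a plain function from prompt to output, and a thread pool stands in for parallel subagents. Generating the FIND/REPLACE edits is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    """Hypothetical scenario: a prompt plus named, observable checks on the output."""
    name: str
    prompt: str
    criteria: dict[str, Callable[[str], bool]]


def evaluate(agent: Callable[[str], str], scenarios: list[Scenario]) -> dict[str, dict[str, bool]]:
    """Run every scenario concurrently and record pass/fail per criterion."""
    def run_one(s: Scenario) -> tuple[str, dict[str, bool]]:
        output = agent(s.prompt)
        return s.name, {label: check(output) for label, check in s.criteria.items()}

    # Threads stand in for parallel subagents (an assumption of this sketch).
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(run_one, scenarios))


def failing(results: dict[str, dict[str, bool]]) -> list[tuple[str, str]]:
    """List (scenario, criterion) pairs that failed; these would drive any proposed edits."""
    return sorted(
        (name, label)
        for name, checks in results.items()
        for label, ok in checks.items()
        if not ok
    )


def stub_agent(prompt: str) -> str:
    """Stand-in agent used only to make the sketch runnable."""
    return prompt.upper()


scenarios = [
    Scenario(
        name="greets",
        prompt="hello",
        criteria={
            "is_upper": str.isupper,
            "mentions_bye": lambda out: "BYE" in out,
        },
    )
]

results = evaluate(stub_agent, scenarios)
print(failing(results))
```

Keeping each criterion as a named boolean check means a failure points at one specific observable behavior, which is the granularity a targeted FIND/REPLACE fix would need.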

stars:0
forks:0
updated: May 22, 2026, 11:39
File explorer
5 files
SKILL.md
readonly