eval-writer

Create new eval suites for the deepagentsjs monorepo. Handles dataset design, test case scaffolding, scoring logic, vitest configuration, and LangSmith integration. Use when the user asks to: (1) create an eval, (2) write an evaluation, (3) add a benchmark, (4) build an eval suite, (5) evaluate agent behaviour, (6) add test cases for a capability, or (7) implement an existing benchmark (e.g. oolong, AgentBench, SWE-bench). Trigger on phrases like 'create eval', 'new eval', 'add eval', 'benchmark', 'evaluate', 'eval suite', 'write evals for'.

stars: 1,252
forks: 200
updated: 17 March 2026 at 17:06
SKILL.md