$pwd:

evaluation

// This skill should be used when building agent evaluation systems: deterministic checks, regression suites, multi-dimensional rubrics, quality gates, production monitoring, baseline comparison, and outcome measurement for agent pipelines.

$ git log --oneline --stat
stars:15,902
forks:1,286
updated:May 19, 2026 at 06:08
SKILL.md