
skill-forge-benchmark

// Benchmark Claude Code skill performance with variance analysis, tracking pass rate, execution time, and token usage across iterations. Runs multiple trials per eval for statistical reliability, aggregates the results into benchmark.json, and generates comparison reports between skill versions. Use when the user says "benchmark skill", "measure skill performance", "skill metrics", "compare skill versions", "skill performance", "track skill improvement", "skill regression test", or "skill A/B test".
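
As a rough illustration of the aggregation described above, the sketch below summarizes several trials of one eval into pass rate plus mean and standard deviation for execution time and token usage, then diffs two skill versions. The `Trial` type, the field names, and the JSON layout are all hypothetical; the skill's actual benchmark.json schema is not shown on this page.

```python
import json
import statistics
from dataclasses import dataclass


@dataclass
class Trial:
    """One run of an eval (hypothetical shape, not the skill's real schema)."""
    passed: bool
    seconds: float
    tokens: int


def summarize(values):
    """Return mean and sample standard deviation (0.0 for a single trial)."""
    values = list(values)
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return {"mean": statistics.mean(values), "stdev": stdev}


def aggregate(trials):
    """Collapse repeated trials of one eval into summary metrics."""
    if not trials:
        raise ValueError("at least one trial is required")
    return {
        "trials": len(trials),
        "pass_rate": sum(t.passed for t in trials) / len(trials),
        "execution_time_s": summarize(t.seconds for t in trials),
        "tokens": summarize(t.tokens for t in trials),
    }


def compare(old, new):
    """Difference in headline metrics between two skill versions (new - old)."""
    return {
        "pass_rate_delta": new["pass_rate"] - old["pass_rate"],
        "time_mean_delta_s": new["execution_time_s"]["mean"]
        - old["execution_time_s"]["mean"],
        "tokens_mean_delta": new["tokens"]["mean"] - old["tokens"]["mean"],
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    v1 = aggregate([Trial(True, 10.0, 1000), Trial(False, 14.0, 1200), Trial(True, 12.0, 1100)])
    v2 = aggregate([Trial(True, 9.0, 900), Trial(True, 11.0, 1000), Trial(True, 10.0, 950)])
    with open("benchmark.json", "w") as f:
        json.dump({"v1": v1, "v2": v2, "comparison": compare(v1, v2)}, f, indent=2)
```

Reporting the standard deviation alongside the mean is what makes a version-to-version difference interpretable: a delta smaller than the trial-to-trial spread is likely noise rather than a real improvement or regression.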

$ git log --oneline --stat
stars:58
forks:28
updated:March 6, 2026 at 16:30
SKILL.md