evaluation-harness

Builds repeatable evaluation systems with golden datasets, scoring rubrics, pass/fail thresholds, and regression reports. Use for "LLM evaluation", "testing AI systems", "quality assurance", or "model benchmarking".
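
As a rough illustration of how those four pieces can fit together, here is a minimal Python sketch. It is not taken from the skill itself: `GoldenCase`, `exact_match`, `run_eval`, `regression_report`, and the 0.8 default threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GoldenCase:
    """One entry in a golden dataset (hypothetical schema)."""
    case_id: str
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Simple scoring rubric: 1.0 on a case-insensitive, trimmed match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    system: Callable[[str], str],
    cases: List[GoldenCase],
    scorer: Callable[[str, str], float] = exact_match,
    threshold: float = 0.8,  # assumed pass/fail cutoff on the mean score
) -> Dict:
    """Score every golden case and apply the pass/fail threshold."""
    if not cases:
        raise ValueError("golden dataset is empty")
    scores = {c.case_id: scorer(system(c.prompt), c.expected) for c in cases}
    mean = sum(scores.values()) / len(scores)
    return {"scores": scores, "mean": mean, "passed": mean >= threshold}


def regression_report(baseline: Dict, current: Dict) -> List[str]:
    """List the case ids whose score dropped relative to the baseline run."""
    return sorted(
        cid
        for cid, score in current["scores"].items()
        if score < baseline["scores"].get(cid, 0.0)
    )
```

Running `run_eval` once against a known-good system and once against a candidate, then passing both results to `regression_report`, gives the list of cases that got worse. In practice the rubric would likely be richer than exact match; the scorer is passed in as a parameter so it can be swapped.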

Install command
npx skills add https://github.com/patricio0312rev/skillset --skill evaluation-harness

Copy and paste this command into Claude Code to install the skill.

Stars: 5
Forks: 0
Updated: December 31, 2025 at 05:05
SKILL.md