evaluation-harness

Builds repeatable evaluation systems with golden datasets, scoring rubrics, pass/fail thresholds, and regression reports. Use for "LLM evaluation", "testing AI systems", "quality assurance", or "model benchmarking".
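
The description names four pieces: a golden dataset, a scoring rubric, a pass/fail threshold, and a regression report. The sketch below shows one way these could fit together. It is not taken from the skill's own SKILL.md; the names `GoldenCase`, `exact_match`, `run_eval`, `passes`, and `regressions`, the exact-match rubric, and the 0.8 threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GoldenCase:
    """One entry of a hypothetical golden dataset."""
    case_id: str
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Assumed rubric: 1.0 if the normalized strings match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    system: Callable[[str], str],
    cases: List[GoldenCase],
    scorer: Callable[[str, str], float] = exact_match,
) -> Dict[str, float]:
    """Score the system under test on every golden case."""
    return {c.case_id: scorer(system(c.prompt), c.expected) for c in cases}


def passes(scores: Dict[str, float], threshold: float = 0.8) -> bool:
    """Pass/fail gate: the mean score must reach the (assumed) threshold."""
    if not scores:
        raise ValueError("no scores to evaluate")
    return sum(scores.values()) / len(scores) >= threshold


def regressions(baseline: Dict[str, float], current: Dict[str, float]) -> List[str]:
    """Regression report: ids of cases that score lower than in the baseline."""
    return sorted(cid for cid, s in current.items() if s < baseline.get(cid, 0.0))


# Toy data and toy "systems" standing in for real model calls.
GOLDEN = [
    GoldenCase("capital-fr", "Capital of France?", "Paris"),
    GoldenCase("two-plus-two", "2 + 2 =", "4"),
]


def baseline_system(prompt: str) -> str:
    return {"Capital of France?": "Paris", "2 + 2 =": "4"}.get(prompt, "")


def candidate_system(prompt: str) -> str:
    return {"Capital of France?": "paris ", "2 + 2 =": "5"}.get(prompt, "")


baseline_scores = run_eval(baseline_system, GOLDEN)
candidate_scores = run_eval(candidate_system, GOLDEN)
report = regressions(baseline_scores, candidate_scores)
```

A real harness would likely replace `exact_match` with a richer rubric (for example, graded criteria or a judge model) and persist the baseline scores between runs so that the regression report compares against a fixed reference.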

Installation command
npx skills add https://github.com/patricio0312rev/skillset --skill evaluation-harness

Copy and paste this command into Claude Code to install the skill.

Stars: 5
Forks: 0
Updated: December 31, 2025, 05:05