evaluation-harness

Overview

Builds repeatable evaluation systems with golden datasets, scoring rubrics, pass/fail thresholds, and regression reports. Use for "LLM evaluation", "testing AI systems", "quality assurance", or "model benchmarking".
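
As a rough illustration of how the four pieces named above fit together, here is a minimal Python sketch. It is not taken from the skill itself: the names `GoldenCase`, `score_case`, `run_eval`, `regression_report`, and `stub_system`, the keyword-based rubric, and the 0.8 threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    """One entry in a hypothetical golden dataset."""
    case_id: str
    prompt: str
    required_keywords: tuple[str, ...]  # rubric: each keyword earns equal credit


def score_case(case: GoldenCase, output: str) -> float:
    """Score an output as the fraction of required keywords it contains."""
    text = output.lower()
    hits = sum(1 for kw in case.required_keywords if kw.lower() in text)
    return hits / len(case.required_keywords)


def run_eval(cases: list[GoldenCase],
             system: Callable[[str], str],
             threshold: float = 0.8) -> dict:
    """Run every golden case and apply a pass/fail threshold to the mean score."""
    scores = {c.case_id: score_case(c, system(c.prompt)) for c in cases}
    mean = sum(scores.values()) / len(scores)
    return {"scores": scores, "mean": mean, "passed": mean >= threshold}


def regression_report(baseline: dict[str, float],
                      current: dict[str, float],
                      tolerance: float = 0.0) -> list[str]:
    """List the case IDs whose score dropped below the baseline."""
    return sorted(
        cid for cid, score in current.items()
        if cid in baseline and score < baseline[cid] - tolerance
    )


# Illustrative data only; a real golden dataset would be curated and versioned.
GOLDEN = [
    GoldenCase("refund", "How do I get a refund?", ("refund", "30 days")),
    GoldenCase("hours", "When are you open?", ("9am", "5pm")),
]


def stub_system(prompt: str) -> str:
    """Stand-in for the model or pipeline under test."""
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "We open at 9am."


if __name__ == "__main__":
    result = run_eval(GOLDEN, stub_system)
    print(result)
    print(regression_report({"refund": 1.0, "hours": 1.0}, result["scores"]))
```

In this sketch the stub answers the refund case fully and the hours case only partially, so the run falls below the threshold and the regression report flags the `hours` case against an all-passing baseline. A real harness would likely replace the keyword rubric with a richer scorer and persist baselines between runs.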

Installation command
npx skills add https://github.com/patricio0312rev/skillset --skill evaluation-harness

Copy and paste this command into Claude Code to install the skill.

Stars: 5
Forks: 0
Updated: December 31, 2025 at 05:05
SKILL.md