evaluation-harness

Overview

Builds repeatable evaluation systems with golden datasets, scoring rubrics, pass/fail thresholds, and regression reports. Use for "LLM evaluation", "testing AI systems", "quality assurance", or "model benchmarking".
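The four pieces named above (golden dataset, scoring rubric, pass/fail threshold, regression report) can be sketched in a few lines of Python. This is an illustrative sketch only, not the skill's actual implementation: `GoldenCase`, `exact_match`, `run_eval`, `regression_report`, and the 0.8 default threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GoldenCase:
    """One golden-dataset entry: an input and its expected answer (hypothetical shape)."""
    case_id: str
    prompt: str
    expected: str


def exact_match(expected: str, actual: str) -> float:
    """A deliberately simple rubric: 1.0 on a case-insensitive match, else 0.0."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_eval(
    cases: List[GoldenCase],
    system: Callable[[str], str],
    scorer: Callable[[str, str], float] = exact_match,
    threshold: float = 0.8,  # assumed pass mark, not taken from the skill
) -> Dict:
    """Score every golden case and compare the mean score to the threshold."""
    scores = {c.case_id: scorer(c.expected, system(c.prompt)) for c in cases}
    mean = sum(scores.values()) / len(scores) if scores else 0.0
    return {"scores": scores, "mean": mean, "passed": mean >= threshold}


def regression_report(baseline: Dict, current: Dict) -> List[str]:
    """Return the IDs of cases whose score dropped relative to a baseline run."""
    return sorted(
        cid
        for cid, score in current["scores"].items()
        if score < baseline["scores"].get(cid, 0.0)
    )


if __name__ == "__main__":
    cases = [
        GoldenCase("c1", "2+2", "4"),
        GoldenCase("c2", "capital of France", "Paris"),
    ]
    # Stand-ins for two versions of the system under test.
    old_answers = {"2+2": "4", "capital of France": "paris"}
    new_answers = {"2+2": "4", "capital of France": "Lyon"}

    baseline = run_eval(cases, old_answers.__getitem__)
    current = run_eval(cases, new_answers.__getitem__)
    print(current["passed"], regression_report(baseline, current))
```

In practice the scorer would usually be richer than exact match (for example a rubric with several weighted criteria), but the flow stays the same: score each case, aggregate, gate on a threshold, and diff against a stored baseline.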

Installation command
npx skills add https://github.com/patricio0312rev/skillset --skill evaluation-harness

Copy this command and paste it into Claude Code to install the skill.

Stars: 5
Forks: 0
Updated: December 31, 2025 at 05:05
SKILL.md