
llm-evals-audit

// Use this skill when a developer wants to check whether their existing evaluations are trustworthy and well-targeted. Triggers on: "audit my evals", "are my evals any good", "review my evaluation setup", "check my LLM judges", "are my evaluations reliable", "something feels off with my eval scores", "inherited an eval system", "my evals are passing but the product feels broken", "post-build eval check", "validate my eval pipeline". Inspects existing eval artifacts (judge prompts, annotation data, issue reports, alignment scores) and produces a prioritized findings report with a concrete fix for each problem. Do NOT use this to build new evals from scratch: for that, use llm-evals-checklist first, then llm-judge-creator.

$ git log --oneline --stat
stars:14
forks:2
updated:April 23, 2026 14:31
SKILL.md
readonly