
advanced-evaluation

// This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

stars: 34,395
forks: 5,691
updated: April 13, 2026, 22:14
SKILL.md