advanced-evaluation

// This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

$ git log --oneline --stat
stars:34,395
forks:5,691
updated:April 13, 2026, 22:14
SKILL.md