Skip to main content
Ejecuta cualquier Skill en Manus
con un clic

agent-evaluation

Estrellas57
Forks18
Actualizado29 de mayo de 2026, 19:29

Use this when you need to EVALUATE OR IMPROVE or OPTIMIZE an existing LLM agent's output quality - including improving tool selection accuracy, answer quality, reducing costs, or fixing issues where the agent gives wrong/incomplete responses. Evaluates agents systematically using MLflow evaluation with datasets, scorers, and tracing. IMPORTANT - Always also load the instrumenting-with-mlflow-tracing skill before starting any work. Covers end-to-end evaluation workflow or individual components (tracing setup, dataset creation, scorer definition, evaluation execution).

Instalación

Instalar con Codex o Claude Copia este prompt, pégalo en Codex, Claude u otro asistente, y deja que revise la página de la skill y la instale por ti.

Explorador de archivos
18 archivos
SKILL.md
readonly