
mlflow-evaluation

// MLflow 3 GenAI evaluation for agent development. Use when (1) writing mlflow.genai.evaluate() code, (2) creating @scorer functions, (3) building evaluation datasets from traces, (4) using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), (5) analyzing traces for latency/errors/architecture, (6) optimizing agent context/prompts/token usage, (7) debugging evaluation failures. Covers the full eval workflow: trace analysis -> dataset building -> scorer creation -> evaluation execution.
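The four-stage workflow named above (trace analysis -> dataset building -> scorer creation -> evaluation execution) can be sketched in plain Python. This is a library-free illustration only: it does not call `mlflow.genai.evaluate()` or the `@scorer` decorator, and the trace fields, helper names (`summarize`, `build_dataset`, `mentions_expected`, `run_evaluation`), and sample data are all hypothetical. The row layout with `inputs`, `outputs`, and `expectations` keys is an assumption chosen to resemble an evaluation record.

```python
from statistics import mean

# Hypothetical trace records, invented for illustration.
traces = [
    {"request": "What is MLflow?",
     "response": "MLflow is an open source MLOps platform.",
     "latency_ms": 420, "error": None},
    {"request": "Who maintains it?",
     "response": "", "latency_ms": 1900, "error": "timeout"},
]


def summarize(traces):
    """Stage 1, trace analysis: aggregate latency and error rate."""
    return {
        "mean_latency_ms": mean(t["latency_ms"] for t in traces),
        "error_rate": sum(t["error"] is not None for t in traces) / len(traces),
    }


def build_dataset(traces, expected):
    """Stage 2, dataset building: keep successful traces as eval rows."""
    return [
        {
            "inputs": {"question": t["request"]},
            "outputs": t["response"],
            "expectations": {"must_mention": expected[t["request"]]},
        }
        for t in traces
        if t["error"] is None
    ]


def mentions_expected(inputs, outputs, expectations):
    """Stage 3, scorer creation: pass if the expected term appears."""
    return expectations["must_mention"].lower() in outputs.lower()


def run_evaluation(dataset, scorers):
    """Stage 4, evaluation execution: mean score per scorer over all rows."""
    return {s.__name__: mean(float(s(**row)) for row in dataset) for s in scorers}


summary = summarize(traces)
dataset = build_dataset(traces, expected={"What is MLflow?": "MLOps"})
results = run_evaluation(dataset, [mentions_expected])
```

Filtering failed traces out before scoring keeps latency and error analysis separate from quality scoring, so a timeout does not show up as a low quality score.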

stars:5
forks:7
updated: January 16, 2026 at 12:14
File explorer
9 files
SKILL.md