nat-evaluation
Use when designing, configuring, running, or troubleshooting NeMo Agent Toolkit evaluations, datasets, evaluator selection, ATIF surfaces, quality gates, custom evaluators, and `nat eval`.
| name | nat-evaluation |
| description | Use when designing, configuring, running, or troubleshooting NeMo Agent Toolkit evaluations, datasets, evaluator selection, ATIF surfaces, quality gates, custom evaluators, and `nat eval`. |
| author | NVIDIA Corporation and Affiliates |
| license | Apache-2.0 |
Use this skill for measuring agent quality and behavior.
`nat eval` and inspect generated artifacts.

- `references/operating-mode.md`
- `references/methodology.md`
- `references/agent-eval-framework.md`
- `references/evaluation-surfaces.md`
- `references/evaluation-contract.md`
- `references/evaluators/`
- `references/code-patterns.md`