llm-evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
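
As a rough illustration of the automated-metrics side of such a framework, a minimal evaluation loop might look like the sketch below. The `EvalCase` structure, `exact_match` metric, and the `fake_model` stub are illustrative assumptions for this example only; they are not defined by the skill itself, and a real setup would swap in an actual LLM call and richer metrics.

```python
# Minimal sketch of an automated-metric evaluation loop (illustrative only).
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected: str


def exact_match(prediction: str, expected: str) -> float:
    """Score 1.0 if the normalized prediction equals the reference, else 0.0."""
    return float(prediction.strip().lower() == expected.strip().lower())


def run_eval(model_fn: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the model and average the per-case scores."""
    scores = [exact_match(model_fn(c.prompt), c.expected) for c in cases]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    cases = [
        EvalCase(prompt="What is 2 + 2?", expected="4"),
        EvalCase(prompt="Capital of France?", expected="Paris"),
    ]
    # Stub model for demonstration; replace with a real LLM call in practice.
    fake_model = lambda prompt: "4" if "2 + 2" in prompt else "Paris"
    print(f"exact-match accuracy: {run_eval(fake_model, cases):.2f}")
```

Human feedback and benchmark comparisons would layer on top of the same loop: the per-case scores can come from annotator ratings or a standard benchmark's scorer instead of `exact_match`.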

stars: 35,133
forks: 3,821
updated: March 7, 2026 at 15:53
SKILL.md (read-only)