llm-evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
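As a minimal sketch of the automated-metrics strategy the description mentions, the snippet below scores model outputs with exact match and token-level F1 over a small QA set. It assumes nothing about this skill's actual interface; `evaluate`, `dummy_model`, and the sample data are hypothetical stand-ins.

```python
# Sketch of an automated-metric evaluation loop (hypothetical names,
# not this skill's actual API).
from collections import Counter
from typing import Callable

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized prediction equals the reference, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a prediction and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = Counter(pred_tokens) & Counter(ref_tokens)  # min counts per token
    num_same = sum(overlap.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def evaluate(model: Callable[[str], str], dataset: list[dict]) -> dict:
    """Run the model over a dataset and return per-metric averages."""
    metrics = {"exact_match": exact_match, "token_f1": token_f1}
    totals = {name: 0.0 for name in metrics}
    for example in dataset:
        prediction = model(example["question"])
        for name, fn in metrics.items():
            totals[name] += fn(prediction, example["answer"])
    return {name: total / len(dataset) for name, total in totals.items()}

if __name__ == "__main__":
    # Hypothetical stand-in for a real LLM call.
    def dummy_model(question: str) -> str:
        return "Paris" if "capital of France" in question else "unknown"

    dataset = [
        {"question": "What is the capital of France?", "answer": "Paris"},
        {"question": "What is 2 + 2?", "answer": "4"},
    ]
    print(evaluate(dummy_model, dataset))
```

In practice the model callable would wrap a real LLM API, and human-feedback or LLM-as-judge scores can be aggregated in the same loop alongside the automated metrics.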

stars: 34,024
forks: 3,690
updated: 2026-03-07 15:53
SKILL.md (read-only)