llm-evaluation

LLM evaluation and testing patterns including prompt testing, hallucination detection, benchmark creation, and quality metrics. Use when testing LLM applications, validating prompt quality, implementing systematic evaluation, or measuring LLM performance.

Install command
npx skills add https://github.com/applied-artificial-intelligence/claude-code-toolkit --skill llm-evaluation

Copy and paste this command into Claude Code to install the skill.

Stars: 72
Forks: 19
Updated: November 1, 2025 at 23:59