shihongDev Agent Skills

| skill | occupation | description | updated |
| --- | --- | --- | --- |
| | software-quality-assurance-analysts-and-testers | Use when LLM judges need calibration, evaluation metrics seem misaligned with expectations, or annotation and judge tuning is needed | 2026-04-07 |
| evalyn-eval | software-quality-assurance-analysts-and-testers | Use when building evaluation datasets, selecting metrics, or running evaluations on an LLM agent project with evalyn | 2026-04-07 |
| evalyn | software-quality-assurance-analysts-and-testers | Use to evaluate an LLM agent with evalyn. Orchestrates the full pipeline: install, instrument, trace, build dataset, suggest metrics, run eval, analyze, calibrate. | 2026-04-07 |
| evalyn-setup | software-quality-assurance-analysts-and-testers | Use when setting up evalyn evaluation for an LLM agent project, instrumenting agent code, or adding the evalyn decorator | 2026-03-22 |
| evalyn-analyze | software-quality-assurance-analysts-and-testers | Use when analyzing evalyn evaluation results, investigating failures, comparing runs, or understanding agent performance | 2026-03-08 |

shihongDev

Where the skills live

Repositories and representative skills