Skip to main content
Run any Skill in Manus
with one click

agentic-bench

Stars5
Forks0
UpdatedFebruary 21, 2026 at 05:58

Autonomous model validation and benchmarking. Investigates any ML model (LLM, image gen, TTS, time series, etc.), runs it on GPU cloud, evaluates quality and performance, and generates HTML reports. Use when user asks to verify, benchmark, evaluate, or test a model. Triggers on "verify model", "benchmark", "evaluate model", "test model", "run benchmark", "model evaluation", "モデルを検証", "ベンチマーク", "モデルを試して".

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

SKILL.md
readonly