tune-ci-thresholds

// Run the CI tests N times per stage on the H20 CI-reproduction host, produce a report of the worst-of-N observation for each metric, and, once the user confirms, write those worst-of-N values back into the test files as the new baselines. Use this skill to recalibrate CI thresholds after an engine update. It currently supports qwen3-omni-v1 and s2-pro-v1 and is extensible via models/<name>/config.yaml.
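
The worst-of-N reduction described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name `worst_of_n`, the metric names, and the `higher_is_better` map are all hypothetical, and the sketch assumes each run yields a flat mapping from metric name to a number. Which direction counts as "worst" depends on the metric, so the direction has to be supplied explicitly.

```python
from typing import Dict, List


def worst_of_n(
    runs: List[Dict[str, float]],
    higher_is_better: Dict[str, bool],
) -> Dict[str, float]:
    """Return the worst value observed for each metric across N runs.

    Assumption: "worst" is the minimum for metrics where higher is better
    (e.g. an accuracy score) and the maximum where lower is better
    (e.g. a latency). Metrics missing from every run are skipped.
    """
    worst: Dict[str, float] = {}
    for metric, hib in higher_is_better.items():
        # Collect this metric's value from every run that reported it.
        values = [run[metric] for run in runs if metric in run]
        if not values:
            continue
        worst[metric] = min(values) if hib else max(values)
    return worst


# Hypothetical observations from N = 3 runs of one stage.
runs = [
    {"accuracy": 0.91, "latency_s": 1.2},
    {"accuracy": 0.89, "latency_s": 1.5},
    {"accuracy": 0.93, "latency_s": 1.1},
]
directions = {"accuracy": True, "latency_s": False}
report = worst_of_n(runs, directions)
print(report)
```

Taking the worst observation rather than the mean gives a baseline that every one of the N runs would have passed, which is presumably the motivation for writing these values back as thresholds.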

stars:296
forks:131
updated: May 23, 2026 at 21:24
File explorer
6 files
SKILL.md
readonly