
tune-ci-thresholds

Run CI tests N times per stage on the H20 CI-reproduction host, produce a per-metric worst-of-N observation report, and (on user confirmation) write the worst-of-N values back into the test files as new baselines. Use when recalibrating CI thresholds after an engine update. Currently supports qwen3-omni-v1 and s2-pro-v1; extensible via models/<name>/config.yaml.
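As a rough illustration of what "per-metric worst-of-N" could mean, here is a minimal sketch. It is not the skill's actual implementation: the metric names, the `higher_is_better` mapping, and the `worst_of_n` helper are all hypothetical, and it assumes each run yields a flat dict of numeric metrics. "Worst" is taken to be the minimum for metrics where higher is better and the maximum for metrics where lower is better.

```python
from typing import Dict, List


def worst_of_n(
    runs: List[Dict[str, float]],
    higher_is_better: Dict[str, bool],
) -> Dict[str, float]:
    """Return the worst observed value per metric across N runs.

    Hypothetical helper: for a higher-is-better metric the worst value is
    the minimum; for a lower-is-better metric it is the maximum.
    """
    worst: Dict[str, float] = {}
    for run in runs:
        for metric, value in run.items():
            if metric not in worst:
                # First observation of this metric becomes the initial worst.
                worst[metric] = value
            elif higher_is_better[metric]:
                worst[metric] = min(worst[metric], value)
            else:
                worst[metric] = max(worst[metric], value)
    return worst


# Illustrative data only: three runs of one stage with made-up metrics.
runs = [
    {"accuracy": 0.91, "latency_ms": 120.0},
    {"accuracy": 0.88, "latency_ms": 135.0},
    {"accuracy": 0.93, "latency_ms": 110.0},
]
directions = {"accuracy": True, "latency_ms": False}
report = worst_of_n(runs, directions)
```

Using the worst observation as the baseline, rather than the mean, would make the resulting threshold tolerant of the run-to-run variance actually seen on the host.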

stars:296
forks:131
updated: May 23, 2026, 21:24
File explorer
6 files
SKILL.md