job-log-triage

// Triage MaxText training jobs from log files, whether the job failed, is hanging, is still running, or has completed. Use when the user asks why a job failed, wants to diagnose an error, reports a crash, hang, timeout, OOM, NCCL error, or heartbeat timeout, wants to understand a job's status, or asks about bad, low, or dropping TGS or throughput.

$ git log --oneline --stat
stars:28
forks:3
updated:May 9, 2026 16:52
SKILL.md