An analysis skill for summarizing HAMi vGPU metrics from Prometheus-style `/metrics` output. It groups GPU allocation by node, device, pod, and namespace, and reports vGPU core allocation, memory allocation, allocation-based utilization, sharing density (how many pods share each device), and namespace-level usage patterns.
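
The grouping step for the vGPU metrics skill can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it parses Prometheus text-format samples with a simple regular expression and sums or counts them by label. The metric name `hypothetical_vgpu_memory_allocated_bytes` and the label names (`node`, `deviceuuid`, `podnamespace`, `podname`) are assumptions made for the example; substitute whatever names your HAMi `/metrics` endpoint actually exposes.

```python
import re
from collections import defaultdict

# Matches sample lines such as: metric_name{label="value",...} 123.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Yield (name, labels, value) per sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        yield name, dict(LABEL_RE.findall(raw_labels or "")), float(value)


def sum_by(samples, metric, label):
    """Sum the values of one metric, grouped by a single label."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric:
            totals[labels.get(label, "<none>")] += value
    return dict(totals)


def count_distinct(samples, metric, group_label, item_label):
    """Count distinct item_label values per group_label (e.g. pods per device)."""
    seen = defaultdict(set)
    for name, labels, _ in samples:
        if name == metric:
            seen[labels.get(group_label, "<none>")].add(labels.get(item_label, "<none>"))
    return {group: len(items) for group, items in seen.items()}


# Hypothetical metric and label names, used only for this example.
METRIC = "hypothetical_vgpu_memory_allocated_bytes"
SAMPLE = """
# TYPE hypothetical_vgpu_memory_allocated_bytes gauge
hypothetical_vgpu_memory_allocated_bytes{node="n1",deviceuuid="GPU-0",podnamespace="ml",podname="train-a"} 4096
hypothetical_vgpu_memory_allocated_bytes{node="n1",deviceuuid="GPU-0",podnamespace="ml",podname="train-b"} 2048
hypothetical_vgpu_memory_allocated_bytes{node="n2",deviceuuid="GPU-1",podnamespace="web",podname="infer"} 1024
"""

samples = list(parse_metrics(SAMPLE))
by_namespace = sum_by(samples, METRIC, "podnamespace")
pods_per_device = count_distinct(samples, METRIC, "deviceuuid", "podname")
print(by_namespace)
print(pods_per_device)
```

Keeping the metric and label names as arguments means the same helpers can serve the per-node, per-device, per-pod, and per-namespace views without changes.
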
Use when pods are stuck in a Pending, CrashLoopBackOff, or ImagePullBackOff state. Performs event-driven triage to quickly identify the root cause, then deep-dives into scheduling failures, resource exhaustion, image pull errors, and crash loops.
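
As a rough sketch of the first triage pass for stuck pods, the function below sorts a pod into one of the failure categories named above. Note the simplification: it inspects the pod's status (as returned by `kubectl get pod -o json`) rather than the cluster event stream, so it only approximates event-driven triage. The function name `triage` and the category strings are hypothetical choices for this example.

```python
# Container waiting reasons mapped to the failure categories in the description.
WAITING_CATEGORIES = {
    "ImagePullBackOff": "image pull error",
    "ErrImagePull": "image pull error",
    "CrashLoopBackOff": "crash loop",
}


def triage(pod):
    """Return (category, detail) for a pod dict parsed from `kubectl get pod -o json`."""
    status = pod.get("status", {})
    container_statuses = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )
    # Container-level waiting reasons are the most specific signal, so check them first.
    for cs in container_statuses:
        waiting = (cs.get("state") or {}).get("waiting") or {}
        reason = waiting.get("reason")
        if reason in WAITING_CATEGORIES:
            return WAITING_CATEGORIES[reason], waiting.get("message", "")
    # A Pending pod whose PodScheduled condition is False was rejected by the scheduler;
    # the condition message usually names the exhausted resource or unmet constraint.
    if status.get("phase") == "Pending":
        for cond in status.get("conditions") or []:
            if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
                return "scheduling failure", cond.get("message", "")
        return "pending (other)", ""
    return "no known failure pattern", ""


pending_pod = {
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/3 nodes are available: 3 Insufficient cpu.",
            }
        ],
    }
}
crashing_pod = {
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {"name": "app", "state": {"waiting": {"reason": "CrashLoopBackOff"}}}
        ],
    }
}

print(triage(pending_pod))
print(triage(crashing_pod))
```

The returned category indicates which deep-dive to run next: scheduler messages for scheduling failures and resource exhaustion, image and registry details for pull errors, or container logs for crash loops.
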