analyze-nightly

Name: Analyze Nightly
Author: tenstorrent

// Analyze a GitHub Actions run and summarize failures

stars:64

forks:27

updated:April 30, 2026 at 05:27

SKILL.md

name	analyze-nightly
description	Analyze a GitHub Actions run and summarize failures
disable-model-invocation	false
allowed-tools	Read, Read(/tmp/), Write(/tmp/), Glob, Grep, Bash(git clone ), Bash(gh run view ), Bash(gh run list ), Bash(gh run download ), Bash(gh pr view ), Bash(tee ), Bash(gh api ), Bash(gh api > /tmp/), Bash(wc -l /tmp/), Bash(jq ), Bash(mkdir -p /tmp/), Bash(rm -rf /tmp/), Bash(for )
context	fork
argument-hint	run-id [save]
model	opus

Create a summary of test failures or job failures, grouped by ownership area:

Ownership area: PJRT unit tests.
- PJRT tests are rooted under various directories in pjrt_implementation/.
Ownership area: vLLM integration and multi-host execution.
- vLLM integration tests are rooted in tests/integrations/vllm_plugin/.
- vLLM integration is rooted in integrations/vllm_tt/.
- Multi-host execution PyTorch tests are rooted in tests/torch/multi_host/.
Ownership area: Performance benchmarks.
- Performance benchmarks are rooted in tests/benchmark/.
Ownership area: Examples.
- Examples are rooted under examples/.
- The test that runs the examples is in tests/examples/test_examples.py.
Ownership area: PyTorch and JAX single-chip and multi-chip tests.
- This is a catch-all category for tests rooted under tests/. Important note: "test" refers broadly to any test run, performance benchmark run, tt-xla project demo run, or tt-xla project example run.
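
As a minimal sketch of how the ownership areas above could be resolved from a file path, the following Python maps path prefixes to area names. The function name `ownership_area` and the rule table are illustrative assumptions, not part of the skill; the only thing taken from the text is the list of directories. More specific prefixes are listed before the `tests/` catch-all so that the catch-all only applies when nothing else matches.

```python
from typing import Optional

# Ordered (prefix, area) rules taken from the ownership list above.
# Specific prefixes must come before the tests/ catch-all.
OWNERSHIP_RULES = [
    ("pjrt_implementation/", "PJRT unit tests"),
    ("tests/integrations/vllm_plugin/", "vLLM integration and multi-host execution"),
    ("integrations/vllm_tt/", "vLLM integration and multi-host execution"),
    ("tests/torch/multi_host/", "vLLM integration and multi-host execution"),
    ("tests/benchmark/", "Performance benchmarks"),
    ("tests/examples/test_examples.py", "Examples"),
    ("examples/", "Examples"),
    ("tests/", "PyTorch and JAX single-chip and multi-chip tests"),
]


def ownership_area(path: str) -> Optional[str]:
    """Return the first ownership area whose prefix matches, else None."""
    for prefix, area in OWNERSHIP_RULES:
        if path.startswith(prefix):
            return area
    return None
```

A path outside all listed directories returns `None`; how such a failure should be reported is not specified in the text.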

Analyze the run with run-id $0 and create a summary of test failures:

Tests/jobs are run in the GitHub Actions workflow with run-id $0.
GitHub repo associated with the workflow is github.com/tenstorrent/tt-xla.
Read the workflow file .github/workflows/schedule-nightly.yml to build context around which jobs the workflow executes. Jobs mostly invoke other workflow *.yml files; read the other workflow *.yml files that are referenced from .github/workflows/schedule-nightly.yml. Some .yml files may be in another repository, namely github.com/tenstorrent/tt-forge. To access files in other repositories, use git to clone them to /tmp/. Ensure that these cloned files are deleted after you complete all your other tasks.
For the run with run-id $0 fetch all job-ids by using the gh CLI tool. Run gh run view $0 --json jobs --jq '.jobs[].url' to fetch all job URLs, which have the following format, from which you can extract {job-id}: https://github.com/tenstorrent/tt-xla/actions/runs/{run-id}/job/{job-id}
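
The job-id extraction described in the step above can be sketched in Python as follows. The helper name `parse_job_url` is hypothetical, and the sketch assumes that both IDs are purely numeric, which the text does not state.

```python
import re
from typing import Tuple

# Pattern mirrors the job URL format quoted in the step above.
# Assumption: run-id and job-id consist of digits only.
JOB_URL_RE = re.compile(
    r"^https://github\.com/tenstorrent/tt-xla/actions/runs/(\d+)/job/(\d+)/?$"
)


def parse_job_url(url: str) -> Tuple[str, str]:
    """Return (run_id, job_id) extracted from a job URL."""
    match = JOB_URL_RE.match(url.strip())
    if match is None:
        raise ValueError(f"unexpected job URL format: {url!r}")
    return match.group(1), match.group(2)
```

Raising on an unexpected format, rather than returning an empty value, keeps a malformed URL from silently dropping a job from the analysis.
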
Fetch details for every job by using the gh api subcommand. Discard any job that was successfully completed, canceled, skipped, or is still in progress. Focus only on jobs that failed!
For each failed job, identify which step(s) failed. If multiple steps failed, focus on the first one, assuming that it caused the subsequent step failures. Fetch and read the raw logs for the failed step by using the gh CLI tool. Search the logs for error messages, failure messages, timeout messages, or any other text indicating a root cause for the failure of that step of the job. Keywords to look for are error, assert, assertion, failed, throw, failure, fatal, timeout, timed out, HTTP error codes in the 400 and 500 ranges, Linux exit codes corresponding to signals that processes can receive, etc. Don't limit yourself to only these keywords; there may be others that indicate a root cause, and these are just the most common ones.
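
The filtering and first-failed-step rules from the two steps above can be sketched in Python. This is an illustration under assumptions: the field names `status`, `conclusion`, `steps`, and `number` are assumed to match the job JSON returned by the gh CLI and should be checked against real output, and treating only the conclusion `failure` as "failed" is one reading of the text.

```python
from typing import Any, Dict, List, Optional

Job = Dict[str, Any]


def failed_jobs(jobs: List[Job]) -> List[Job]:
    """Keep only completed jobs whose conclusion is 'failure'."""
    return [
        job for job in jobs
        if str(job.get("status", "")).lower() == "completed"
        and str(job.get("conclusion", "")).lower() == "failure"
    ]


def first_failed_step(job: Job) -> Optional[Dict[str, Any]]:
    """Return the lowest-numbered failed step, or None if no step failed."""
    steps = sorted(job.get("steps", []), key=lambda s: s.get("number", 0))
    for step in steps:
        if str(step.get("conclusion", "")).lower() == "failure":
            return step
    return None
```

Sorting by step number before scanning means the result does not depend on the order in which the steps were returned.
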
Strategies for improving log crawling: (a) Always use case-insensitive pattern matching for specific words or phrases. (b) If the logs are truncated or too large, download them to a temporary directory in /tmp/. Ensure that this temporary directory is deleted after you complete all your other tasks. (c) If a log exceeds 5000 lines, first try to use Grep to search for keywords rather than reading the full log. Only if you fail to identify a root cause with Grep, read the whole log. (d) If you identify a root cause, searching backward through the log will in most cases yield a line of text identifying the specific test that failed.
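
Strategies (a) and (d) above can be illustrated with a small Python sketch: a case-insensitive keyword scan followed by a backward search for a line that names the test. The function name `locate_failure` is hypothetical, and the `TEST_ID` pattern assumes that tests appear in the log as pytest node IDs of the form `path.py::name`, which is an assumption about the log format. Taking the first keyword hit as the root cause is a deliberate simplification; real logs will contain false positives.

```python
import re
from typing import List, Optional, Tuple

# A subset of the keywords listed in the text; "assert" also covers "assertion".
KEYWORDS = re.compile(
    r"error|assert|failed|throw|failure|fatal|timeout|timed out",
    re.IGNORECASE,
)
# Assumption: tests show up as pytest node IDs such as path/to/file.py::test[param].
TEST_ID = re.compile(r"\S+\.py::\S+")


def locate_failure(lines: List[str]) -> Optional[Tuple[str, Optional[str]]]:
    """Return (first keyword line, nearest preceding test id), or None."""
    for i, line in enumerate(lines):
        if KEYWORDS.search(line):
            # Walk backward from the hit to find the test that was running.
            for j in range(i, -1, -1):
                found = TEST_ID.search(lines[j])
                if found:
                    return line.strip(), found.group(0)
            return line.strip(), None
    return None
```
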
Job steps (and therefore their logs) that are responsible for running tests are always running multiple tests, not just one. Tests are in most cases either a pytest command specifying a parametrized test, or model tests invoked by the tests/runner/test_models.py script with the model name as the parameter in brackets. The same test may run on different hardware architectures, but never in the same job, so it is possible for the same test to fail in multiple jobs with the same root cause.
Gather information relevant to the failure: the name of the test and/or model that failed (or the name of the job step that failed, if the former is not applicable), the root cause, and the hardware architecture (if applicable). If the same test fails with the same root cause on multiple architectures, list it once with all affected architectures in the {arch-list}.
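
The deduplication rule above, one entry per test and root cause with all affected architectures collected, can be sketched in Python. The name `merge_architectures` and the tuple layout are illustrative assumptions. Keeping the URL of the first job seen is also an assumption, since the text does not say which job link to use when several jobs share a failure.

```python
from typing import Dict, Iterable, Tuple

# (test_or_step_name, root_cause, architecture, job_url)
Failure = Tuple[str, str, str, str]


def merge_architectures(failures: Iterable[Failure]) -> Dict[Tuple[str, str], dict]:
    """Group failures by (name, root cause) and collect distinct architectures."""
    merged: Dict[Tuple[str, str], dict] = {}
    for name, cause, arch, url in failures:
        # Assumption: the first job URL seen for a group is the one reported.
        entry = merged.setdefault((name, cause), {"archs": [], "url": url})
        if arch and arch not in entry["archs"]:
            entry["archs"].append(arch)
    return merged
```
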
Summarize which tests or job steps have failed. If a job step failed because a test failed, present it as a test failure. If a job step failed before any test is run, present it as a job failure unrelated to test execution. Group all failures by ownership area. If an ownership area has no failures, do not emit any text for that area in the output.

Output format that you need to follow (raw Markdown text):

# {ownership-area-name}

## {root-cause-1}
- {test-or-step-name} ({arch-list}) -> [job-link]({url})
- {test-or-step-name} ({arch-list}) -> [job-link]({url})

## {root-cause-2}
- {test-or-step-name} ({arch-list}) -> [job-link]({url})
- {test-or-step-name} ({arch-list}) -> [job-link]({url})

# {ownership-area-name}

## {root-cause-1}
- {test-or-step-name} ({arch-list}) -> [job-link]({url})
- {test-or-step-name} ({arch-list}) -> [job-link]({url})

## {root-cause-2}
- {test-or-step-name} ({arch-list}) -> [job-link]({url})
- {test-or-step-name} ({arch-list}) -> [job-link]({url})
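
To make the template above concrete, here is a minimal Python sketch that renders it from nested data. The function name `render_summary` and the input shape (area, then root cause, then entries) are assumptions chosen for illustration; joining the architectures with a comma is also an assumption, since the template only shows `{arch-list}`. Areas without failures are skipped, as the instructions require.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, List[str], str]  # (test_or_step_name, architectures, job_url)


def render_summary(summary: Dict[str, Dict[str, List[Entry]]]) -> str:
    """Render {area: {root_cause: [entries]}} in the output format above."""
    blocks = []
    for area, causes in summary.items():
        if not causes:
            continue  # emit nothing for areas without failures
        parts = [f"# {area}"]
        for cause, entries in causes.items():
            lines = [f"## {cause}"]
            for name, archs, url in entries:
                lines.append(f"- {name} ({', '.join(archs)}) -> [job-link]({url})")
            parts.append("\n".join(lines))
        blocks.append("\n\n".join(parts))
    return "\n\n".join(blocks)
```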

After producing the Markdown summary:

If $1 equals "save" (i.e. this skill was invoked as a subskill by another skill), write the full Markdown output to /tmp/nightly-analysis-$0.md using the Write tool, then emit one final line: Analysis written to: /tmp/nightly-analysis-$0.md
If $1 is absent or empty (i.e. this skill was invoked directly by the user), only emit the Markdown summary to the conversation. Do not write any files.
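
The two delivery modes above can be sketched in Python as follows. The function name `deliver` and the `out_dir` parameter are hypothetical; `out_dir` exists only so the sketch can be tried outside /tmp. The sketch reads the instructions as: the summary is always emitted, and save mode additionally writes the file and appends the final line.

```python
from pathlib import Path
from typing import List


def deliver(markdown: str, run_id: str, mode: str = "", out_dir: str = "/tmp") -> List[str]:
    """Return the messages to emit; write the report only in save mode."""
    messages = [markdown]
    if mode == "save":
        path = Path(out_dir) / f"nightly-analysis-{run_id}.md"
        path.write_text(markdown, encoding="utf-8")
        messages.append(f"Analysis written to: {path}")
    return messages
```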

Always respect these additional constraints:

Never ask for, and never run, any gh {subcommand} command that may modify the state of the GitHub repository (for example issue creation/deletion, PR closing, branch manipulation, etc.), especially when using the gh api subcommand. Always use only read-only calls!
Never execute multiple commands separated by a semicolon!
Always use for loops in bash for executing commands for different job-ids! Never execute the same command separately for multiple job-ids!

related-skills.json

same repository

triage-dtype-bfloat16.md

from "tenstorrent/tt-xla"

Triage one tt-forge-models training test failing with a bfloat16 dtype-mismatch RuntimeError (e.g. "mat1 and mat2 must have the same dtype, but got Float and BFloat16", "'<op>' not implemented for 'BFloat16'"). For cross-dtype operands, attempts a minimal loader fix propagating `dtype_override` into the offending tensor constructor, then re-runs CPU + pytest and updates the YAML (passing -> EXPECTED_PASSING; new failure -> KNOWN_FAILURE_XFAIL). For op-not-implemented (no PyTorch kernel), goes straight to KNOWN_FAILURE_XFAIL with the verbatim error. Updates every training entry sharing the affected loader. Never edits inference YAML or `dynamic_loader.py`.

2026-05-25

triage-unpack-forward-output.md

from "tenstorrent/tt-xla"

Triage one tt-forge-models training test stuck at FAILED_FE_COMPILATION with reason "tt-forge-models doesn't implement unpack_forward_output for this model." Inspects the model's forward output, registers a handler or writes a per-loader override, and updates the YAML.

2026-05-14

ci-benchmark-analyzer.md

from "tenstorrent/tt-xla"

Analyze CI benchmark workflow runs from GitHub Actions for the tt-xla project. Produces a markdown report covering failed jobs (with root-cause error extraction via logs and Glean), successful model performance metrics (samples/sec, TTFT, device perf), perf regressions/improvements vs previous nightly, and the full dependency commit chain (tt-xla, tt-mlir, tt-metal). Use this skill whenever the user wants to analyze a CI run, review nightly benchmark results, investigate CI failures, check benchmark performance from a workflow run, or asks about "latest nightly" results. Also trigger when the user pastes a GitHub Actions run URL or mentions a run ID in the context of performance analysis, or asks about perf regressions.

2026-05-06

finding-missed-fusions.md

from "tenstorrent/tt-xla"

Use when auditing a TTNN model's IR for missed op fusion opportunities — both direct TTNN fusions (a fused ttnn op already exists) and theoretical fusions (the pattern is a single kernel in torch/triton/cuda)

2026-05-06

graph-break-analysis.md

from "tenstorrent/tt-xla"

Analyzes, debugs and proposes fixes for graph breaks in PyTorch/XLA model compilation. Use when a model generates more graphs than expected during compilation, the user mentions "graph break", or when debugging excessive graph generation in tt-xla pipelines.

2026-04-21

code-reviewer.md

from "tenstorrent/tt-xla"

Code review skill specialized for tt-xla (Python + C++ PJRT plugin for Tenstorrent hardware). Covers C++ memory safety, PJRT API patterns, Python test standards, and project-specific conventions.

2026-03-27

package.json

"author": "tenstorrent"

"repository": "tenstorrent/tt-xla"
