payload-snapshot
// Snapshot OpenShift payload data (release controller, PR diffs, comments, CI jobs, JUnit results, regression tracking) to a local directory for offline analysis
| name | payload-snapshot |
| description | Snapshot OpenShift payload data (release controller, PR diffs, comments, CI jobs, JUnit results, regression tracking) to a local directory for offline analysis |
This skill downloads all data needed to analyze an OpenShift payload into a local directory tree. The resulting snapshot can be navigated entirely via file reads — no live API calls required during analysis.
Use this skill when you need to:
Python 3 (3.10 or later)
GitHub CLI (gh): for PR diff, comment, and job data
  brew install gh (macOS) or see https://cli.github.com
  gh auth login
  Without gh, release controller data is still fetched; PR data is skipped
Google Cloud SDK (gcloud): for JUnit test result download
  brew install google-cloud-sdk (macOS) or see https://cloud.google.com/sdk
  gcloud auth login
  Without gcloud, JUnit data is skipped; job directories are still created
Network access to:
  *.ocp.releases.ci.openshift.org (release controller)
  api.github.com (via gh CLI)
  storage.googleapis.com (via gcloud CLI)

script_path="plugins/ci/skills/payload-snapshot/scripts/payload_snapshot.py"
# Snapshot a specific payload
python3 "$script_path" 4.22.0-0.nightly-2026-02-25-152806
# Custom output directory
python3 "$script_path" 4.22.0-0.nightly-2026-02-25-152806 --output-dir .work/snapshot
# Limit chain depth
python3 "$script_path" 4.22.0-0.nightly-2026-02-25-152806 --max-chain 5
# Skip JUnit download (faster, still generates job structure and summary)
python3 "$script_path" 4.22.0-0.nightly-2026-02-25-152806 --no-junit
The script will:
The output directory is structured for easy navigation:
payload/
  <version>/
    <stream>/
      summary.json                   # START HERE: full triage data
      CLAUDE.md                      # Imports AGENTS.md for Claude Code
      AGENTS.md                      # Dynamic snapshot orientation doc
      streams.json                   # All streams for this version
      <tag>/                         # Each payload in the chain
        payload.json                 # Release controller API response
        changelog.json               # PRs that changed vs. previous payload
        regressions.json             # Test failure regression tracking
        jobs/
          blocking/
            <job-name>/
              job.json               # Job metadata (state, URLs, GCS link, retries)
              build_log.json         # Error/warning lines + log tail (failed only)
              junit/                 # Only for failed jobs
                junit_operator.xml   # CI phase results
                junit-aggregated.xml # Aggregated jobs only
                results.json         # Parsed test failures (full output)
          informing/
            <job-name>/
              job.json               # Job metadata only (no JUnit/build log)
        <component>/                 # e.g., machine-config-operator
          prs/
            <pr_number>/
              code.diff              # Git diff of the PR
              comments.json          # PR comments and reviews
              jobs.json              # CI check runs
Find failed blocking jobs (with streaks):
jq '.blocking_jobs.failed_jobs[] | {name, state, streak: .streak.streak_length, pattern: .streak.failure_pattern}' payload/<version>/<stream>/summary.json
Check test failures and when they started:
jq '.[] | {test: .test_name, first_failed: .first_failed_in, payloads: .payloads_failing, jobs: .jobs}' payload/<version>/<stream>/<tag>/regressions.json
List PRs in a payload:
jq '.changeLogJson.updatedImages[].commits[] | {component: .name, pr: .pullURL, subject: .subject}' payload/<version>/<stream>/<tag>/changelog.json
Read a specific PR's diff:
cat payload/<version>/<stream>/<tag>/<component>/prs/<number>/code.diff
Check JUnit failures for a specific job:
jq '.[].name' payload/<version>/<stream>/<tag>/jobs/blocking/<job-name>/junit/results.json
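For scripted analysis, the first lookup above can also be done in Python. The following is a minimal sketch that relies only on the summary.json fields documented in this skill (blocking_jobs.failed_jobs[] with name, state, and an optional streak object). The function names load_summary and failed_blocking_jobs are hypothetical and are not part of the skill, and the sample data is made up.

```python
import json
from pathlib import Path


def load_summary(path: str) -> dict:
    """Read a summary.json file from a snapshot directory."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def failed_blocking_jobs(summary: dict) -> list[dict]:
    """Return name, state, and streak info for each failed blocking job.

    `streak` is optional in summary.json, so missing values come back
    as None instead of raising KeyError.
    """
    rows = []
    for job in summary.get("blocking_jobs", {}).get("failed_jobs", []):
        streak = job.get("streak") or {}
        rows.append({
            "name": job.get("name"),
            "state": job.get("state"),
            "streak": streak.get("streak_length"),
            "pattern": streak.get("failure_pattern"),
        })
    return rows


if __name__ == "__main__":
    # Made-up data shaped like the documented schema; not real CI output.
    example = {
        "blocking_jobs": {
            "failed_jobs": [
                {"name": "job-a", "state": "Failed",
                 "streak": {"streak_length": 3, "failure_pattern": "example"}},
                {"name": "job-b", "state": "Failed"},
            ]
        }
    }
    for row in failed_blocking_jobs(example):
        print(row)
```

Against a real snapshot, failed_blocking_jobs(load_summary("payload/<version>/<stream>/summary.json")) would be the equivalent call, with the placeholders replaced by actual directory names.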
python3 payload_snapshot.py <payload_tag> [OPTIONS]
Positional:
payload_tag Payload tag (e.g., 4.22.0-0.nightly-2026-02-25-152806)
Options:
--output-dir DIR Base output directory (default: payload)
--max-chain N Maximum backward chain depth (default: 20)
--workers N Parallel workers for API calls (default: 8)
--no-junit Skip JUnit download and regression tracking
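As an illustration of the interface above, here is a minimal argparse sketch that accepts the same positional argument, options, and defaults. It is not taken from payload_snapshot.py; the function name build_parser and the help strings are assumptions, and the real script may declare its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="payload_snapshot.py")
    parser.add_argument(
        "payload_tag",
        help="Payload tag (e.g., 4.22.0-0.nightly-2026-02-25-152806)",
    )
    parser.add_argument("--output-dir", default="payload", metavar="DIR",
                        help="Base output directory")
    parser.add_argument("--max-chain", type=int, default=20, metavar="N",
                        help="Maximum backward chain depth")
    parser.add_argument("--workers", type=int, default=8, metavar="N",
                        help="Parallel workers for API calls")
    parser.add_argument("--no-junit", action="store_true",
                        help="Skip JUnit download and regression tracking")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```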
streams.json
Lists all available streams for the payload's version.

summary.json
Comprehensive stream-level triage data; start here. Contains:
  payload_tag, phase, release_url, architecture, stream, version
  chain_length, baseline_tag, hours_since_baseline
  blocking_jobs.failed_jobs[]: detailed objects with name, state, prow_url, gcs_url, and relative path job_json. May include: rhcos_version, streak (with streak_length, originating_payload, is_new_failure, failure_pattern), build_log_errors, test_failure_count, and relative paths junit_results, build_log
  informing_jobs.failed_jobs[]: job name strings
  test_failures.blocking[]: test_name, jobs, first_failed_in, payloads_failing, failure_message, failure_text (full, not truncated)
  payloads[]: per-payload entries with tag, phase, relative file paths, and prs[] with component/diff/comments paths

AGENTS.md / CLAUDE.md
Dynamic orientation document generated at snapshot time. Contains the specific payload tag, chain, failed jobs, file layout, key concepts, and summary.json schema. CLAUDE.md imports AGENTS.md via @AGENTS.md.

payload.json
Full release controller response including blockingJobs, informingJobs, and asyncJobs with their states, Prow URLs, and retry attempt URLs.

changelog.json
Release controller diff response with changeLogJson.updatedImages listing every PR that changed between this payload and its predecessor.

regressions.json
Per-payload regression tracking data. For each failing test in the target payload:
  test_name: the failing test
  jobs: which jobs it fails in
  first_failed_in: the earliest payload in the chain where it was failing
  payloads_failing: how many consecutive payloads it has been failing
  failure_message: the error message
  failure_text: full failure output

job.json
Per-job metadata including name, state, lifecycle (blocking/informing), Prow URL, GCS browser URL (gcs_url), retry count, whether it's an aggregated job, GCS bucket path, and rhcos_version.
The rhcos_version field is determined from the job name and OCP version:
  rhcos9_10: heterogeneous cluster (mixed RHCOS 9 and 10 node pools)
  rhcos10: RHCOS 10 only
  rhcos9: RHCOS 9 (explicit)
  rhcos9-default: no explicit fragment in the job name; currently defaults to RHCOS 9

build_log.json (failed blocking jobs only)
Extracted from build-log.txt in GCS (handles gzip decompression). Contains:
  total_lines: total line count of the build log
  error_warning_count: number of lines matching error/warning patterns
  error_warning_lines[]: each with line_number and text
  tail_start_line, tail_lines[]: last 20% of the log for context

results.json (in junit/ subdirectory)
Parsed JUnit test failures for a specific job. Only includes failed/error tests. For aggregated jobs, includes per-run pass/fail/skip data with Prow URLs for each run.
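To make the "failed/error tests only" filtering concrete, here is a minimal sketch of how JUnit XML could be reduced to a list of failures using the Python standard library. It is not the script's actual parser. The function name parse_junit_failures is hypothetical, and apart from name (which the jq example for results.json reads), the output keys kind, message, and text are assumptions about what such a record might hold.

```python
import xml.etree.ElementTree as ET


def parse_junit_failures(xml_text: str) -> list[dict]:
    """Return one record per <testcase> that has a <failure> or <error> child.

    Passing and skipped test cases are dropped.
    """
    root = ET.fromstring(xml_text)
    failures = []
    # iter() matches <testcase> at any depth, so both <testsuite> and
    # <testsuites> roots are handled.
    for case in root.iter("testcase"):
        for kind in ("failure", "error"):
            node = case.find(kind)
            if node is not None:
                failures.append({
                    "name": case.get("name", ""),
                    "kind": kind,
                    "message": node.get("message", ""),
                    "text": (node.text or "").strip(),
                })
                break  # record each test case at most once
    return failures


if __name__ == "__main__":
    # Made-up JUnit document for illustration.
    sample = """
    <testsuite name="example">
      <testcase name="passes"/>
      <testcase name="fails"><failure message="boom">stack trace</failure></testcase>
      <testcase name="skipped"><skipped/></testcase>
    </testsuite>
    """
    for failure in parse_junit_failures(sample):
        print(failure["name"], failure["kind"], failure["message"])
```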
code.diff, comments.json, jobs.json
PR artifacts from GitHub (unchanged from previous version).
The script chains backwards from the target payload until it finds a payload where all blocking jobs succeeded. This is stricter than the Accepted phase — a payload can be force-accepted with failed blocking jobs, which does not count as a stop point.
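The stop rule can be sketched as a short loop. This is a simplified illustration under several assumptions, not the script's implementation: the two lookup callables stand in for release controller queries, the state string "Succeeded" is assumed, the target is always included without being tested as a baseline, an empty job list is not treated as clean, and max_chain caps the total chain length. All names are hypothetical.

```python
from typing import Callable, Optional

SUCCEEDED = "Succeeded"  # assumed release controller state string


def is_clean(blocking_states: list[str]) -> bool:
    """True only if there is at least one blocking job and all succeeded."""
    return bool(blocking_states) and all(s == SUCCEEDED for s in blocking_states)


def walk_chain(
    target: str,
    blocking_states_for: Callable[[str], list[str]],
    previous_of: Callable[[str], Optional[str]],
    max_chain: int = 20,
) -> list[str]:
    """Collect tags from the target back to the first clean predecessor.

    The payload phase is deliberately ignored: a force-accepted payload
    with failed blocking jobs does not stop the walk.
    """
    chain = [target]
    tag = previous_of(target)
    while tag is not None and len(chain) < max_chain:
        chain.append(tag)
        if is_clean(blocking_states_for(tag)):
            break  # clean baseline found; it is the last entry in the chain
        tag = previous_of(tag)
    return chain


if __name__ == "__main__":
    # Made-up history: p3 stands for a force-accepted payload with a failure.
    states = {
        "p4": ["Failed", "Succeeded"],
        "p3": ["Failed", "Succeeded"],
        "p2": ["Succeeded", "Succeeded"],
        "p1": ["Succeeded"],
    }
    previous = {"p4": "p3", "p3": "p2", "p2": "p1", "p1": None}
    print(walk_chain("p4", states.__getitem__, previous.get))
```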
For terminal payloads (Accepted/Rejected), jobs showing Pending on the release controller are cross-checked against the actual Prow prowjob.json artifact to get their real state.
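A rough sketch of that cross-check follows. It assumes prowjob.json is a serialized ProwJob whose status.state holds one of Prow's lowercase states, and it assumes a particular mapping from those states to release-controller-style states (treating error and aborted as Failed is a guess). The function name resolve_job_state and the mapping table are hypothetical.

```python
from typing import Optional

TERMINAL_PHASES = {"Accepted", "Rejected"}

# Assumed mapping from Prow job states to release-controller-style states.
# Non-final Prow states (e.g., "pending", "triggered") are left unmapped.
PROW_TO_RC_STATE = {
    "success": "Succeeded",
    "failure": "Failed",
    "error": "Failed",
    "aborted": "Failed",
}


def resolve_job_state(phase: str, rc_state: str, prowjob: Optional[dict]) -> str:
    """Return the job state, consulting prowjob.json only when warranted.

    The release controller state is trusted unless the payload is terminal
    and the job is still reported as Pending.
    """
    if phase not in TERMINAL_PHASES or rc_state != "Pending" or prowjob is None:
        return rc_state
    prow_state = prowjob.get("status", {}).get("state", "")
    # Fall back to the release controller state if Prow has no final result.
    return PROW_TO_RC_STATE.get(prow_state, rc_state)


if __name__ == "__main__":
    # Made-up prowjob.json content for illustration.
    print(resolve_job_state("Rejected", "Pending", {"status": {"state": "failure"}}))
```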
Aggregated jobs run the same underlying test multiple times with statistical analysis. The script:
  Identifies them by the aggregated- name prefix
  Reads junit-aggregated.xml, which contains per-run pass/fail/skip data
  Parses <system-out> to extract individual run URLs

gh not authenticated: Prints a warning and continues without PR data
gcloud not available: Prints a warning and skips JUnit download

Jobs other than failed blocking jobs get metadata only (job.json but no JUnit or build log)
The --workers flag controls parallelism for all subprocess calls (default 8)

fetch-payloads (fetches recent payloads from the release controller)
fetch-new-prs-in-payload (fetches PRs new in a specific payload)
payload-analysis (analyzes a payload snapshot for revert candidates)