rota-bench-regression-analysis
// Analyze recent GraalPy benchmark regressions on `master` as part of the weekly rota. Use when asked to analyze benchmarks for rota.
| name | rota-bench-regression-analysis |
| description | Analyze recent GraalPy benchmark regressions on `master` as part of the weekly rota. Use when asked to analyze benchmarks for rota. |
Run `scripts/compare_bench_regressions.py --rota`, writing the JSON output to a file for the later steps:

    scripts/compare_bench_regressions.py --rota --json-out /tmp/compare_bench_regressions_rota.json
Focus on `plausible` regressions. Ignore flaky and inconclusive items unless they help explain a plausible shift.

Categories for the findings:

- `Attributed`
- `To bisect`
- `To watch`

Use `write_stdin` and a 1 hour timeout (the configuration might cap this at a lower timeout in practice).

Inspect the JSON output:

    jq '.direct_suspects[] | {good_commit, bad_commit, bad_author_email, bad_subject}' \
      /tmp/compare_bench_regressions_rota.json

    jq '[.change_points[] | select(.classification == "plausible")]' \
      /tmp/compare_bench_regressions_rota.json
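The first `jq` query can also be done in Python when the suspects need to be turned into report lines. This is a minimal sketch, assuming the JSON has the `direct_suspects` fields that the query selects; the function name, the 12-character abbreviation, and the sample record are illustrative, not part of the script's documented output.

```python
import json


def suspect_lines(report, abbrev=12):
    """Render each direct suspect as 'commit | author email | subject'.

    Assumes every entry carries bad_commit, bad_author_email and
    bad_subject, as selected by the jq query.
    """
    lines = []
    for suspect in report.get("direct_suspects", []):
        lines.append(
            f"{suspect['bad_commit'][:abbrev]} | "
            f"{suspect['bad_author_email']} | "
            f"{suspect['bad_subject']}"
        )
    return lines


def load_report(path="/tmp/compare_bench_regressions_rota.json"):
    """Load the JSON written by --json-out."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Made-up record, only to show the shape of the output.
example = {
    "direct_suspects": [
        {
            "good_commit": "0" * 40,
            "bad_commit": "abcd1234efgh" + "0" * 28,
            "bad_author_email": "author@oracle.com",
            "bad_subject": "Full subject",
        }
    ]
}
for line in suspect_lines(example):
    print(line)
```

Against the real file, call `suspect_lines(load_report())`.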
- Start with `direct_suspects` in `/tmp/compare_bench_regressions_rota.json`.
- `micro`, `meso`, `macro`...
- Then go through the `change_points` whose `(good_commit, bad_commit]` pair is not already covered by `direct_suspects`.

Inspect each commit range with:

    git log --first-parent --reverse --format='%H%x09%ae%x09%s' GOOD..BAD
    git show --stat --summary --format=fuller BAD
    git diff --stat GOOD..BAD
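Picking out the change points that still need inspection can be sketched in Python. This is a hedged illustration: it assumes each `change_points` entry carries `good_commit` and `bad_commit` keys like the `direct_suspects` entries do, which the source implies but does not show. The function name and the sample data are hypothetical.

```python
def uncovered_plausible(report):
    """Return plausible change points not already covered by direct_suspects.

    A change point counts as covered when its (good_commit, bad_commit)
    pair matches the pair of some direct suspect.
    """
    covered = {
        (suspect["good_commit"], suspect["bad_commit"])
        for suspect in report.get("direct_suspects", [])
    }
    return [
        point
        for point in report.get("change_points", [])
        if point.get("classification") == "plausible"
        and (point.get("good_commit"), point.get("bad_commit")) not in covered
    ]


# Made-up data: one covered point, one flaky point, one left to inspect.
sample_report = {
    "direct_suspects": [{"good_commit": "aaa", "bad_commit": "bbb"}],
    "change_points": [
        {"good_commit": "aaa", "bad_commit": "bbb", "classification": "plausible"},
        {"good_commit": "ccc", "bad_commit": "ddd", "classification": "flaky"},
        {"good_commit": "eee", "bad_commit": "fff", "classification": "plausible"},
    ],
}
remaining = uncovered_plausible(sample_report)
print([(p["good_commit"], p["bad_commit"]) for p in remaining])
```

Each remaining pair can then be substituted for `GOOD` and `BAD` in the `git` commands.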
Inspect a single commit with:

    git show --stat --summary --format=fuller COMMIT

Attribution rules:

- If a jump lands on a single commit and the `mx.graalpython/suite.py` imports did not change, it can usually be attributed to that commit.
- A jump on a commit that changes the imports in `mx.graalpython/suite.py` can never be confidently attributed without bisecting Graal. Keep those unattributed and say so explicitly (including the Graal commit range).
- If `native` shows an exact jump on one commit and `jvm` later shows the same benchmark shifted upward through a range containing that commit, treat them as likely the same cause unless the later range has a better candidate.
- If both `jvm` and `native` jump at or immediately after a Graal import update, keep both under the same unattributed Graal-side cause.

When the history is unclear, query it with `bench-cli run -` and check whether the change is a clean step up that stays high, a one-point spike that immediately falls back, or already present before the reported range.

- Filter on `branch = master`, target benchmark, `host-vm = graalvm-ee`, target `host-vm-config`, `guest-vm = graalpython`, target `guest-vm-config`, `metric.name = time`, `commit.committer-ts last-n 30d`.
- Select `commit.rev` and average metric value so the step pattern is easy to inspect.
- `bench-cli` sometimes fails with 404 when the server is overloaded. If that happens, wait for a minute and try again.

Report the findings under `Attributed` and `To bisect` and `To watch`:

- List each commit as `abcd1234efgh | author@oracle.com | Full subject`.
- For `To bisect` items, include a `scripts/bisect_benchmark_regression.py` command that can bisect it (use unabbreviated commits in this case).
- If you do not have `bench-cli`, ask the user to provide it from the bench-server repo. It is under `INFRA` and the repo is called `bench-server`.
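The three history shapes to check for (a clean step up that stays high, a one-point spike that falls back, a shift already present before the reported range) can be told apart mechanically once the average metric values per commit are in hand. The following is only a sketch under stated assumptions: the 5% threshold, the function name, and the median-based baseline are illustrative choices, not something the rota scripts or `bench-cli` provide. It treats higher values as worse, which fits `metric.name = time`.

```python
from statistics import median


def classify_shift(values, range_start, rel=0.05):
    """Classify the step pattern of a benchmark history.

    values: average metric per commit, oldest first.
    range_start: index of the first measurement inside the reported range.
    rel: relative increase treated as a real shift (assumed 5%).
    """
    before, after = values[:range_start], values[range_start:]
    if len(before) < 4 or not after:
        return "insufficient data"

    # If the later half of the pre-range history is already elevated
    # compared with the earlier half, the shift predates the range.
    half = len(before) // 2
    if median(before[half:]) > median(before[:half]) * (1 + rel):
        return "already present"

    threshold = median(before) * (1 + rel)
    high = [value > threshold for value in after]
    if not any(high):
        return "no shift"

    tail = high[high.index(True):]
    if len(tail) == 1:
        return "inconclusive"  # only the newest point is elevated
    if all(tail):
        return "clean step"
    if not any(tail[1:]):
        return "one-point spike"
    return "noisy"


# Made-up series, only to show the three shapes.
print(classify_shift([100, 101, 99, 100, 120, 121, 119], 4))
print(classify_shift([100, 101, 99, 100, 120, 100, 101], 4))
print(classify_shift([100, 100, 120, 121, 120, 121], 4))
```

A result of `clean step` supports keeping the item as a regression, while `one-point spike` and `already present` are reasons to look at the graph by hand before attributing anything.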