| name | github-pr-sentinel |
| description | Monitor a GitHub pull request until it's merged, green, or blocked. Polls CI checks, review comments, and mergeability state continuously. Diagnoses failures, retries flaky checks up to 3 times, auto-fixes branch-related issues when possible, and stops only when user help is required. Use when asked to "monitor a PR", "watch CI", "handle review comments", "sentinel a PR", or keep an eye on failures and feedback. |
| compatibility | Any CLI agent (OpenCode, Claude, Codex, Cursor, etc.) |
| metadata | {"source":"https://github.com/openai/codex/tree/main/.codex/skills/babysit-pr","version":"1.0.0"} |
PR Sentinel
Monitor a GitHub pull request persistently until one of these terminal outcomes occurs.
Terminal Outcomes (stop when any is true)
- PR merged or closed ā stop immediately on confirmation.
- PR ready to merge ā CI green, no unaddressed review comments, not blocked on required review approval, no merge conflict risk.
- User help required ā CI infrastructure issues, exhausted flaky retries (3 cycles), permission problems, or ambiguous situations that cannot be resolved safely.
Do not stop merely because a snapshot returns idle while checks are still pending.
Inputs
Accept any of:
- No argument: infer PR from current branch (
--pr auto)
- PR number
- PR URL
Core Workflow
- When asked to "monitor"/"watch"/"babysit" a PR, start with
--watch (continuous mode) unless doing a one-shot diagnostic.
- Run the watcher script to snapshot PR/CI/review state.
- Inspect the
actions list in the JSON response.
- If
diagnose_ci_failure is present, inspect failed run logs and classify the failure.
- If branch-related: patch code locally, commit, and push.
- If
process_review_comment is present, inspect surfaced review items and decide whether to address them.
- If a review item is actionable and correct, patch code locally, commit, and push.
- If likely flaky/unrelated and
retry_failed_checks is present, rerun failed jobs with --retry-failed-now.
- If both actionable review feedback and
retry_failed_checks are present, prioritize review feedback first ā a new commit retriggers CI, so avoid rerunning flaky checks on the old SHA.
- On every loop, verify mergeability / merge-conflict status (e.g. via
gh pr view).
- After any push or rerun, immediately return to step 1 and continue polling on the updated SHA.
- If you paused
--watch to patch/commit/push, relaunch --watch yourself in the same turn after the push.
- Repeat polling until the PR is green + review-clean + mergeable,
stop_pr_closed appears, or a user-help-required blocker is reached.
- Keep consuming watcher output in the same turn while babysitting is active ā do not end the turn with a detached
--watch process.
Commands
One-shot snapshot
python3 .agents/skills/github-pr-sentinel/scripts/gh_pr_watch.py --pr auto --once
Continuous watch (JSONL)
python3 .agents/skills/github-pr-sentinel/scripts/gh_pr_watch.py --pr auto --watch
Trigger flaky retry cycle
python3 .agents/skills/github-pr-sentinel/scripts/gh_pr_watch.py --pr auto --retry-failed-now
Explicit PR target
python3 .agents/skills/github-pr-sentinel/scripts/gh_pr_watch.py --pr <number-or-url> --once
CI Failure Classification
Use gh commands to inspect failed runs before deciding to rerun:
gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha
gh run view <run-id> --log-failed
Branch-related ā logs point to changed code (compile/test/lint/typecheck in touched areas):
- Patch code locally, commit with conventional format, push.
Flaky/unrelated ā transient infra issues (timeouts, runner failures, registry outages, rate limits):
Ambiguous ā do one manual diagnosis attempt before choosing rerun.
See references/heuristics.md for the full checklist.
Review Comment Handling
The watcher surfaces review items from:
- PR issue comments
- Inline review comments
- Review submissions (COMMENT / APPROVED / CHANGES_REQUESTED)
It surfaces feedback from trusted human reviewers (repo OWNER/MEMBER/COLLABORATOR + authenticated operator) and approved review bots.
On a fresh state file, existing pending review feedback is surfaced immediately (not only new comments after monitoring starts).
When a comment is actionable and correct
- Patch code locally.
- Commit:
fix: address PR review feedback (#<n>)
- Push to the PR head branch.
- Resume watching on the new SHA immediately ā do not stop after reporting the push.
- If monitoring was in
--watch mode, restart --watch immediately after the push.
When a comment is non-actionable
- Already resolved in GitHub ā safely ignore.
- Ambiguous ā record as handled, continue polling.
- Conflicts with user intent ā stop and ask.
Git Safety Rules
- Work only on the PR head branch.
- Avoid destructive git commands.
- Do not switch branches unless necessary to recover context.
- Before editing, check for unrelated uncommitted changes. If present, stop and ask the user.
- After each fix, commit and
git push, then re-run the watcher.
- If you interrupted
--watch to fix, restart --watch immediately after the push.
- Do not run multiple concurrent
--watch processes for the same PR.
- A push is not a terminal outcome ā continue monitoring unless a strict stop condition is met.
Commit message format
fix: fix CI failure on PR #<n>
fix: address PR review feedback (#<n>)
Monitoring Loop Pattern
- Run
--once.
- Read
actions.
- Check if PR is merged/closed ā if so, report terminal state and stop.
- Check CI summary, new review items, mergeability/conflict status.
- Diagnose CI failures; classify branch-related vs flaky.
- Process actionable review comments before flaky reruns when both are present.
- Retry failed checks only when
retry_failed_checks is present and you are not about to replace the current SHA.
- If you pushed a commit or triggered a rerun, report briefly and continue polling.
- After a review-fix push, proactively restart
--watch in the same turn.
- If everything is passing, mergeable, no unaddressed reviews, and not blocked on approval ā report success and stop.
- If blocked on user-help issue ā report the blocker and stop.
- Otherwise sleep per polling cadence and repeat.
Preferring --watch
When the user asks to monitor/watch/babysit a PR, use --watch for autonomous polling. Use --once only for debugging or one-shot checks.
Do not stop to ask whether to continue ā poll autonomously until a strict stop condition or explicit user interruption.
Polling Cadence (Adaptive)
- CI not green: poll every 1 minute.
- CI green: start at 1 min, back off exponentially on no change (1m ā 2m ā 4m ā 8m ā 16m ā 32m), cap at 1 hour.
- Reset to 1 min whenever anything changes (new SHA, check status change, new review, mergeability change).
- CI regresses (new commit, rerun failure): return to 1-minute polling.
- PR merged/closed: stop immediately.
Stop Conditions (Strict)
Stop only when:
| Condition | Action |
|---|
| PR merged or closed | Stop immediately |
| PR ready to merge (green + review-clean + mergeable) | Report success, stop |
| User intervention required (infra outage, exhausted retries, permissions) | Report blocker, stop |
Keep polling when:
actions contains only idle but checks are still pending.
- CI is still running/queued.
- Review state is quiet but CI is not terminal.
- CI is green but mergeability is unknown/pending.
- CI is green and mergeable but waiting on review approval.
- CI is green but merge-conflict risk may change.
Output Expectations
During monitoring
- Concise progress updates on status changes only.
- Occasional heartbeat during long unchanged periods.
- Push confirmations, intermediate CI snapshots, and review-action updates are progress only ā not final.
CI green celebration
One-time when CI first transitions to all green:
š CI is all green! 33/33 passed. Still on watch for review approval.
Final summary (when a stop condition is met)
Include:
- Final PR SHA
- CI status summary
- Mergeability / conflict status
- Fixes pushed
- Flaky retry cycles used
- Remaining unresolved failures or review comments
Rationalizations
| Rationalization | Reality |
|---|
| "CI is just being flaky, I'll just merge it" | Flakiness masks real bugs. Rerun or fix, never ignore. |
| "A human will review this anyway, I don't need to be perfect" | The sentinel's job is to save human reviewers time by catching low-hanging fruit. |
| "This PR is small, I can skip the full monitoring loop" | Small PRs can have big impacts. Consistency is key to reliability. |
Red Flags
References
references/heuristics.md - CI/Review decision tree for fix vs rerun vs stop
references/github-api-notes.md - GitHub CLI commands and endpoints used by the watcher