with one click
babysit-pr
Monitor a PR, fix feedback and CI failures until fully green for 30 min. Run with /babysit-pr <number>
Menu
Monitor a PR, fix feedback and CI failures until fully green for 30 min. Run with /babysit-pr <number>
Use Agent-Native Plans for UI-first planning with an optional top pan/zoom wireframe canvas, a refined Notion-like document, rich tabs, diagrams, comments, drawing, and agent handoff.
Use Agent-Native Plans when coding-agent work needs an interactive structured plan document with inline diagrams, implementation maps, optional UI/product wireframes or prototypes, annotations, and comments.
Use Agent-Native Plans to turn a code change, PR diff, or git diff into a visual recap plan for high-altitude review — schema, API, file, and before/after changes as grounded structured blocks instead of a wall of diff.
Creating, editing, and managing extensions — sandboxed Alpine.js mini-apps that run inside iframes. Use when a user asks for a dashboard, widget, calculator, or any interactive mini-app that calls external APIs. Distinct from LLM "tools" (function calls) — see note below.
Use Agent-Native Plans to ask rich visual intake questions when /visual-questions is explicitly requested before creating a UI plan or visual plan.
Query Prometheus-compatible metrics endpoints (self-hosted, Grafana Cloud Prom, AMP, etc.) via the prometheus action and as a dashboard panel source.
| name | babysit-pr |
| description | Monitor a PR, fix feedback and CI failures until fully green for 30 min. Run with /babysit-pr <number> |
| user-invocable | true |
| metadata | {"internal":true} |
Monitor PR #$ARGUMENTS in the current repo. Fix CI failures and human or bot review feedback until everything is green and no new feedback arrives for 30 minutes.
If no PR number is given, auto-detect it: get the current branch (git branch --show-current), find the open PR for it (gh pr list --head <branch> --state open --json number --limit 1). If no open PR exists, check recent merged/closed PRs. Only ask the user if no PR can be found.
ScheduleWakeup before yielding — pass this same /babysit-pr <number> … invocation back as the wake-up prompt so the next firing repeats the tick. The loop ends only at a stop condition (below); until then there is always a scheduled next tick.pnpm run prep), you may rely on its completion notification but always also schedule a fallback ScheduleWakeup — notifications can silently fail to fire, and an unguarded wait becomes an indefinite stall. The loop must keep ticking regardless.pnpm run prep / vitest can hang or take minutes, and on a branch with concurrent edits a full local run is contaminated by other agents' in-flight files anyway. If local validation is slow, hung, or unreliable, push and let the CI you are already monitoring be the validation gate — a red CI job is caught and fixed on the very next tick. Prefer pushing your work over holding it for a clean local run.Step 0 — always do this first, before anything else:
git status --short to check for local uncommitted changes from concurrent agents.git diff --stat to understand what changed, then write a descriptive commit message based on the actual changes (e.g. "feat(tools): add error toast + dark mode sync" or "fix(analytics): update sidebar layout"). Never use generic messages like "chore: sweep concurrent agent changes".git add <files> && git commit -m "<descriptive message>" && git push.git log --oneline origin/<branch>..HEAD to check for local commits not yet on the remote.git push.This ensures every tick starts with a clean, fully-pushed working tree. Never skip this step.
Never git stash concurrent changes. Stashes get orphaned, and a stash named babysit-tickN-concurrent-work-* left on the source branch while babysit-pr's PR ships without it is exactly how real work has been lost (2026-05-05: stash@{0} held a Sentry-instrumentation feature for clips, including a new analytics.ts module, that was meant to merge with PR #511's followup but stayed stuck in the stash list because babysit stashed instead of committing). If you see local changes you don't recognize, that's still other agents' work — commit it with a descriptive message based on the diff, don't hide it in a stash.
Step 1 — check for merge conflicts:
gh pr view $ARGUMENTS --json mergeable --jq '.mergeable'.CONFLICTING: bring main in and resolve. Commit/push any local changes first (Step 0) so the tree is clean, then prefer a merge over a rebase — git fetch origin main && git merge --no-edit origin/main — because this branch is shared with concurrent agents and a rebase would rewrite history and require a force-push that can clobber their unpushed commits. Resolve the conflicts (for pnpm-lock.yaml, take one side with git checkout --theirs -- pnpm-lock.yaml then regenerate with pnpm install --lockfile-only against the merged package.json), git add the resolved files, complete the merge commit, and push (a normal push, never --force). This resets the soak timer. Only rebase if the user explicitly asks for a linear history.MERGEABLE or UNKNOWN: proceed. (mergeStateStatus: BLOCKED with mergeable: MERGEABLE just means required checks are still pending/red — that is not a conflict; keep going.)Then proceed with PR checks:
Check for review comments and review summaries from humans and bots — EVERY tick, with no exceptions.
⚠️ Review bots (Builder, Copilot, etc.) RE-REVIEW on every push and post a brand-new round of comments each time. A PR commonly accumulates several rounds. You MUST re-check on every single tick — including "quiet" ticks where you're only waiting on CI — and you must keep checking right up until the moment you merge.
Never filter comments by a "since " window. A forward-looking timestamp silently skips rounds that were posted before your last reply (e.g. a round that landed between the first review and when you replied), and "0 new since X" reads as "all addressed" when it is not. This exact mistake left two whole review rounds unanswered on PR #1097 (2026-06-08).
Instead, determine coverage by reply state: list every top-level review comment that does not yet have a reply, across all pages and all rounds. Stream every comment with --jq '.[]' (concatenates cleanly across pages), then slurp:
gh api --paginate repos/{owner}/{repo}/pulls/$ARGUMENTS/comments --jq '.[]' \
| jq -s '
([ .[] | .in_reply_to_id // empty ]) as $replied
| .[]
| select((.in_reply_to_id // null) == null) # top-level comments only
| select(.id as $id | ($replied | index($id)) | not) # …with no reply yet
| {id, user: .user.login, path, line: (.line // .original_line), snippet: (.body[0:200])}'
(Bind the id with .id as $id first — index(.id) would evaluate .id against the $replied array, not the comment, and error out.) If that command prints anything, there is unaddressed feedback — fix or reply to each (see "Responding to feedback") before you consider the PR clean. Also re-read the latest review summary bodies each tick (bots restate their findings here):
gh api repos/{owner}/{repo}/pulls/$ARGUMENTS/reviews --jq '.[] | select(.body != null and .body != "") | {user: .user.login, state, submitted_at, body: .body[0:1000]}'
Treat the count of unaddressed comments (not a timestamp) as the source of truth for "is there feedback to handle".
Check CI status:
gh pr checks $ARGUMENTS
If new human or bot feedback includes real bugs or requested changes:
pnpm run prep to verify locallyIf GitHub Actions CI is failing (lint, test, typecheck, build):
pnpm run prep locallySpecial case: missing changeset. If the failing job is Require changeset for publishable package changes (from .github/workflows/changeset-check.yml), do NOT treat it as a code bug. The job log includes a structured line MISSING_CHANGESET_PACKAGES: pkg1,pkg2. Parse that, then write a .changeset/<short-slug>.md directly — do NOT run the interactive pnpm changeset add. Use the PR title and diff to decide bump type (default to patch for bugfixes / docs / refactors; minor for additive features; major only when the PR description clearly signals breaking). Shape:
---
"@agent-native/<pkg-1>": patch
"@agent-native/<pkg-2>": patch
---
<one-line summary derived from the PR title>
Slug example: dispatch-route-shells.md (kebab-case, descriptive, ~3 words). Commit with chore: add changeset for <packages>, push, reset the timer. The check will pass on the next CI run.
If only external CI fails (Cloudflare Workers, Netlify, etc.) and GitHub Actions passes:
If everything green + no new feedback for 30 min: cancel the loop, report done
Every human or bot comment must get a reply — either a fix or an explanation of why you're skipping it.
gh api repos/{owner}/{repo}/pulls/$ARGUMENTS/comments/{id}/replies -f body="..." explaining why (pre-existing, false positive, not practical, etc.)Skip (with a reply explaining why) issues that are:
Fix issues that are:
Never auto-merge by default. Only merge when the user explicitly asks you to.
When the user does ask to merge, all of these must be true simultaneously for 10 consecutive minutes before merging:
git status --short must be emptygit log --oneline origin/<branch>..HEAD must be emptygh pr view --json mergeable --jq '.mergeable' must be MERGEABLEThe 10-minute soak timer resets to zero whenever you push anything, CI fails, a new review comment arrives, merge conflicts appear, or local changes are found and committed.
Only after 10 consecutive clean minutes, force merge with gh pr merge <number> --squash --admin.
Before stopping OR merging, the unaddressed-comments command above must print nothing — re-run it as the final gate. "I replied earlier" is not sufficient; bots may have posted new rounds since.