| name | babysit-pr |
| description | Babysit the GitHub PR associated with the current branch: check whether merging the latest base branch would conflict, resolve and commit merge conflicts with `commit-smart` when needed, then loop on `dd-gitlab/*` CI checks until they pass; when concrete dd-gitlab jobs fail, classify the fetched Mosaic traces, merge the latest base when failures look external, use `code-implement-loop` only for failures that are likely caused by the PR, update the PR body at the end, and run PR reviews. |
Babysit PR
Hard Rules
- First, review the # Global Rules from your memory file and apply them before the skill-specific rules below.
- Infer the PR from the current branch with gh, then validate that local state matches the inferred PR.
- Do not broaden scope beyond:
- merge-conflict remediation against the latest PR base branch
- fixing failing dd-gitlab/* CI jobs
- updating the PR body at the end
- parallel Codex and Claude PR reviews after checks are green
- Treat dd-gitlab/default-pipeline as a rollup check, not a concrete job trace source.
Workflow
0) Preflight
- Resolve repo scope and enforce the strict coding preflight:
eval "$("$HOME/dotfiles/scripts/coding-preflight.mjs")"
- The helper must provide: inside_worktree, worktree_root, worktree_path, branch, repo, in_dd_scope, origin_branch_ref, origin_branch_exists, local_ahead_count, origin_ahead_count.
- If the helper exits non-zero, stop and report blocked status with the helper's stderr.
- Change into the worktree root: cd "$worktree_root"
- Load the PR associated with the current branch:
if ! pr_meta_json="$(gh pr view --repo "$repo" "$branch" --json number,url,baseRefName,headRefName,headRefOid)"; then
echo "FAILED: current branch has no associated PR"
exit 1
fi
- Parse pr_number, pr_url, base_ref, head_ref, and head_sha from pr_meta_json:
pr_number="$(jq -r '.number' <<<"$pr_meta_json")"
pr_url="$(jq -r '.url' <<<"$pr_meta_json")"
base_ref="$(jq -r '.baseRefName' <<<"$pr_meta_json")"
head_ref="$(jq -r '.headRefName' <<<"$pr_meta_json")"
head_sha="$(jq -r '.headRefOid' <<<"$pr_meta_json")"
If pr_url is empty or null, stop and return FAILED: current branch has no associated PR.
- Confirm the current checkout matches the inferred PR:
- branch from the helper must equal head_ref
- git rev-parse HEAD must equal head_sha
- if either check fails, stop and report the mismatch
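The preflight checks above could be sketched roughly as follows. `load_preflight` and `check_checkout` are hypothetical names introduced for illustration, and the sketch assumes the helper prints shell assignments on stdout, as the eval line implies. Capturing the output before evaluating it keeps a non-zero helper exit visible instead of letting eval mask it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of the skill.

# Run the preflight command given as arguments and eval its stdout
# only when the command itself succeeded.
load_preflight() {
  local out
  if ! out="$("$@")"; then
    echo "BLOCKED: preflight helper failed" >&2
    return 1
  fi
  eval "$out"
}

# Compare local state with the PR metadata.
# Args: branch head_ref local_sha head_sha
check_checkout() {
  local branch="$1" head_ref="$2" local_sha="$3" head_sha="$4"
  if [ "$branch" != "$head_ref" ]; then
    echo "MISMATCH: branch '$branch' != PR head '$head_ref'" >&2
    return 1
  fi
  if [ "$local_sha" != "$head_sha" ]; then
    echo "MISMATCH: HEAD $local_sha != PR head SHA $head_sha" >&2
    return 1
  fi
}
```

A caller might run `load_preflight "$HOME/dotfiles/scripts/coding-preflight.mjs"` first and, once the PR metadata is parsed, `check_checkout "$branch" "$head_ref" "$(git rev-parse HEAD)" "$head_sha"`.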
1) Check whether merging the latest base branch would conflict
- Query GitHub for the PR mergeability state:
merge_state_json="$(gh pr view --repo "$repo" "$pr_url" --json mergeable,mergeStateStatus)"
mergeable="$(jq -r '.mergeable' <<<"$merge_state_json")"
merge_state_status="$(jq -r '.mergeStateStatus' <<<"$merge_state_json")"
- If mergeable=="UNKNOWN", wait briefly and repoll a small number of times so GitHub can finish computing mergeability.
- Interpret the result from GitHub:
- mergeable=="MERGEABLE": no conflict-driven merge is needed at this stage
- mergeable=="CONFLICTING": the PR branch conflicts with the latest base branch
- any other value, or a persistent UNKNOWN: stop and report mergeable and mergeStateStatus as blocked
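One way to express the repoll-then-interpret logic is sketched below. The function names, the attempt count, and the sleep interval are all assumptions; the skill only says to repoll "a small number of times". The fetch command is passed in so the polling logic stays independent of gh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names, retry count, and interval are assumptions.

# Repoll until the fetch command reports something other than UNKNOWN.
# Args: max_attempts sleep_seconds fetch_command...
# The fetch command must print the current mergeable value on stdout.
poll_mergeable() {
  local max_attempts="$1" sleep_seconds="$2"
  shift 2
  local attempt=1 mergeable="UNKNOWN"
  while [ "$attempt" -le "$max_attempts" ]; do
    mergeable="$("$@")" || return 1
    if [ "$mergeable" != "UNKNOWN" ]; then
      break
    fi
    # Only sleep when another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$sleep_seconds"
    fi
    attempt=$((attempt + 1))
  done
  printf '%s\n' "$mergeable"
}

# Map the final value to the next action described in Step 1.
mergeable_action() {
  case "$1" in
    MERGEABLE) echo "skip-merge" ;;
    CONFLICTING) echo "merge-base" ;;
    *) echo "blocked" ;;
  esac
}
```

With a small wrapper such as `fetch_mergeable() { gh pr view --repo "$repo" "$pr_url" --json mergeable --jq '.mergeable'; }`, the call `mergeable_action "$(poll_mergeable 5 10 fetch_mergeable)"` would yield the branch to take. A persistent UNKNOWN falls through to "blocked", matching the last rule above.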
2) If conflicts exist, merge latest base, resolve them, and commit
Only run this step when Step 1 reported mergeable=="CONFLICTING".
- Refresh and merge the latest base branch into the PR branch:
git fetch origin "$base_ref"
git merge --no-ff "origin/$base_ref"
- Resolve conflicts with the smallest change that restores the intended PR behavior.
- Run the minimum targeted verification needed for the conflict resolution.
- Invoke commit-smart immediately to create the merge commit and push it.
- After commit-smart completes, continue into the CI loop below.
3) Loop on dd-gitlab/* checks until they all pass
Run the following loop until every dd-gitlab/* check has passed.
Each iteration includes these steps:
- Refresh the checks:
checks_json="$(gh pr checks --repo "$repo" "$pr_url" --json name,workflow,state,bucket,link)"
dd_gitlab_checks_json="$(
jq '
map(select(.name | startswith("dd-gitlab/")))
' <<<"$checks_json"
)"
- Partition the dd-gitlab/* checks:
- pending: bucket=="pending"
- failed: bucket=="fail" or bucket=="cancel"
- passed: bucket=="pass"
- If there are zero dd-gitlab/* checks, treat that as "jobs not started yet" rather than success. Sleep for a fixed interval such as 60 seconds, then start the next loop iteration.
- If any dd-gitlab/* checks are still pending, do not handle failures yet. Sleep for a fixed interval such as 60 seconds, then start the next loop iteration.
- Once there are one or more dd-gitlab/* checks and zero pending dd-gitlab/* checks:
- if all dd-gitlab/* checks passed, skip the remaining steps in this iteration and exit the loop
- otherwise continue with the failure-handling steps below
- Split the failed dd-gitlab/* checks into:
- fetchable_failed_jobs: failed checks whose link contains taskId=gitlab and taskExecutionId=
- rollup_only_failures: failed checks such as dd-gitlab/default-pipeline whose link does not include a concrete taskExecutionId=
- If there are no fetchable_failed_jobs, stop and report blocked status with the failing rollup checks. The skill cannot fetch logs for a rollup-only failure.
- For each job in fetchable_failed_jobs:
- fetch the failure log with:
node "$HOME/dotfiles/scripts/fetch-mosaic-ci-log.mjs" "<mosaic-link>"
- treat the JSON returned by fetch-mosaic-ci-log.mjs as the source of truth for web_url (the GitLab job URL) and trace_file (the local path to the fetched trace file)
- read trace_file and extract:
- the failing Bazel target or job step
- the failing test name or command when present
- the concrete error text or exception
- a concise failure summary
- whether the failure is likely caused by this PR
- classify each job as either likely caused by this PR or likely not caused by this PR
- treat failures like the following as likely not caused by this PR unless stronger evidence points to the patch:
- checkout/bootstrap failures before repo code executes
- gitretriever fetch failed
- source fetch or checkout cleanup failures
- runner or CI environment bootstrap failures
- truncated logs with no concrete repo target, test, or command failure visible
- If any job in fetchable_failed_jobs is classified as likely not caused by this PR, do not invoke code-implement-loop yet. Remediate against the freshest base branch first:
git fetch origin "$base_ref"
- if the fetch fails, stop and report blocked status
- if the current branch already contains the freshly fetched origin/$base_ref, stop and report blocked status rather than retrying CI unchanged
- otherwise merge the freshly fetched base:
git merge --no-ff "origin/$base_ref"
- if the merge conflicts, resolve them with the smallest change that restores intended PR behavior
- run the minimum targeted verification needed for the merge or conflict resolution
- invoke commit-smart immediately to commit and push the merge result
- after commit-smart completes, sleep for a fixed interval such as 60 seconds, then start the next loop iteration
- Hand off the fetchable_failed_jobs that are still classified as likely caused by this PR to code-implement-loop. The handoff must include, for each such job:
- the PR URL
- the GitLab job URL from web_url
- the local trace file path from trace_file
- the failure summary extracted from the trace
- Invoke code-implement-loop with that raw failure context as the entire implementation scope.
- If code-implement-loop returns blocked status, propagate it and stop.
- If code-implement-loop succeeds, continue the loop from its first step and repoll the dd-gitlab/* checks.
Example handoff to code-implement-loop:
Fix the failing dd-gitlab CI jobs for PR https://github.com/DataDog/dd-source/pull/406053 only.
- dd-gitlab/test-all:unit
PR: https://github.com/DataDog/dd-source/pull/406053
GitLab job: https://gitlab.ddbuild.io/DataDog/dd-source/-/jobs/1620901756
Trace file: /tmp/mosaic-ci-1620901756/job-1620901756.log
Summary: //domains/assistant/apps/apis/assistant_api:py_default_test failed because test_background_worker.py::test_run_command_agent_populates_background_worker_payload raised TypeError: object MagicMock can't be used in 'await' expression
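The decision points of the Step 3 loop can be sketched as small predicates. All three function names are hypothetical. `branch_contains_base` shows one possible way to test whether the branch already includes the fetched base, using `git merge-base --is-ancestor`; the skill itself does not prescribe a command for that check.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the function names are assumptions, not part of the skill.

# Decide what one loop iteration should do from the check counts.
# Args: total pending passed
loop_action() {
  local total="$1" pending="$2" passed="$3"
  if [ "$total" -eq 0 ]; then
    echo "wait"             # zero checks means jobs have not started yet
  elif [ "$pending" -gt 0 ]; then
    echo "wait"             # never handle failures while checks are pending
  elif [ "$passed" -eq "$total" ]; then
    echo "done"
  else
    echo "handle-failures"
  fi
}

# A failed check is fetchable only when its link carries both markers.
is_fetchable_link() {
  case "$1" in
    *taskId=gitlab*) ;;
    *) return 1 ;;
  esac
  case "$1" in
    *taskExecutionId=*) return 0 ;;
    *) return 1 ;;
  esac
}

# True when HEAD already contains the fetched base, so another merge
# would not change what CI runs. Must be called inside the worktree.
# Args: base_ref
branch_contains_base() {
  git merge-base --is-ancestor "origin/$1" HEAD
}
```

The counts passed to `loop_action` would come from the pending and passed partitions of dd_gitlab_checks_json. Anything that is neither pending nor passed pushes the iteration into failure handling, where a failure without a fetchable link ends in the blocked status described above.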
4) Update PR body
After all dd-gitlab/* checks pass, review the entire PR change, not just commits or fixes made during this skill run:
gh pr view --repo "$repo" "$pr_url" --json title,body,commits,files
gh pr diff --repo "$repo" "$pr_url"
Only update the PR body when either:
- the existing PR body is empty
- the existing PR body starts with the hidden marker:
If the existing PR body is non-empty and does not start with this marker, treat it as manually edited and skip the PR body update.
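The gate above might be sketched as a single predicate. `body_is_managed` is a hypothetical name, and the marker is passed in by the caller; this sketch deliberately does not hard-code the marker text.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the marker text is supplied by the caller.

# Return 0 when the PR body may be rewritten: it is empty, or it starts
# with the managed marker. Return 1 for a manually edited body.
# Args: body marker
body_is_managed() {
  local body="$1" marker="$2"
  if [ -z "$body" ]; then
    return 0
  fi
  # An empty marker would match every body, so refuse it.
  if [ -z "$marker" ]; then
    return 1
  fi
  case "$body" in
    "$marker"*) return 0 ;;   # quoted, so the marker is matched literally
    *) return 1 ;;
  esac
}
```

A caller could skip the whole body update with `body_is_managed "$existing_body" "$marker" || echo "PR body left unchanged"`.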
Update the managed body by splicing new generated content into the existing body:
- If the existing PR body is empty, initialize the base text to the hidden marker followed by a blank line.
- Otherwise, keep the original body as the base text. Do not regenerate the whole body from scratch.
- Locate level-2 section headings: lines that start with "## ".
- Generate new content only for the managed ## Problem and ## Approach sections, using the PR title, managed body, commit list, changed files, and full PR diff.
- Write for a reviewer who is deciding what to inspect first.
- Prefer concrete review areas over broad architecture phrasing.
- Do not compress multiple subsystems into one long sentence.
- Do not enumerate every touched file, test, or mechanical edit.
- If the PR spans multiple subsystems, use short bullets grouped by review boundary.
- Upsert the ## Problem section:
- If a line exactly matching ## Problem exists, replace that full section. The section starts at ## Problem and ends immediately before the next ## heading, or at end of body.
- If it does not exist, create a new ## Problem section after the marker and any immediately following blank lines.
- Upsert the ## Approach section:
- If a line exactly matching ## Approach exists, replace that full section. The section starts at ## Approach and ends immediately before the next ## heading, or at end of body.
- If it does not exist, create a new ## Approach section immediately after the ## Problem section.
- The ## Problem section must be concise and reviewer-digestible:
## Problem
<why this change is needed>
Requirements:
- State the current limitation or missing capability in plain language.
- State the user-visible or reviewer-visible outcome this PR enables.
- Keep it to 1-2 short paragraphs or 2-3 bullets.
- Avoid umbrella phrases like "end-to-end path" unless the following text names the concrete boundaries.
- The ## Approach section must be concise and reviewer-digestible:
## Approach
<key implementation choices>
Requirements:
- Organize by review boundary, not by commit order.
- For multi-subsystem PRs, prefer 3-5 bullets with bold labels.
- Each bullet should name what changed and why that boundary matters.
- Mention tests only when they clarify behavior coverage or reviewer risk.
- Keep implementation detail high-level enough that a reviewer can choose where to dive into the diff.
- Leave every byte outside those two managed sections unchanged. Do not edit, reorder, remove, or regenerate any other section or content.
- Then update the PR body with:
gh pr edit --repo "$repo" "$pr_url" --body-file "<body-file>"
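The "replace that full section" half of the upsert could be sketched with awk as below. `replace_section` is a hypothetical name, and the replacement file is assumed to hold the complete new section including its heading line. The sketch covers only the case where the heading already exists; it exits non-zero when the heading is missing so the caller can fall back to the insertion rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replaces one level-2 section in the body read from stdin.

# Args: heading new_section_file
# Prints the spliced body on stdout; exits 1 if the heading was not found.
replace_section() {
  local heading="$1" new_file="$2"
  awk -v heading="$heading" -v new_file="$new_file" '
    # First exact match of the heading: emit the replacement, start skipping.
    $0 == heading && !found {
      found = 1
      skipping = 1
      while ((getline line < new_file) > 0) print line
      close(new_file)
      next
    }
    # The next level-2 heading ends the section being replaced.
    skipping && /^## / { skipping = 0 }
    !skipping { print }
    END { exit (found ? 0 : 1) }
  '
}
```

For example, `replace_section '## Problem' problem.md < body.md > body.new.md` would rewrite only that section. One caveat: awk writes a trailing newline even when the input lacked one, so a stricter byte-for-byte guarantee would need extra handling.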
Focus on the high-level problem and approach
- Skip mechanical details such as added unit tests, renamed variables, changed function arguments, or other implementation minutiae unless they are essential to understanding the design.
- The goal is to state the problem clearly and lay out the high-level approach so reviewers can review the PR efficiently.
- The output should help reviewers triage the diff. If the generated text reads like an abstract design summary, rewrite it around concrete review boundaries.
For a large background-worker PR, prefer this shape over a dense paragraph:
## Problem
Foreground assistant requests need to accept long-running work quickly without keeping the request open. Today there is no typed handoff to a background worker that can report progress and write the final result back to the conversation.
This PR adds that handoff, so reviewers should focus on the API-to-worker contract, callback validation, and persistence behavior.
## Approach
- **Worker contract:** Add typed start-task and worker-message callback payloads, including callback context and `PROGRESS`/`FINAL` event shapes.
- **Worker runtime:** Implement the assistant-agent-worker workflow, activity execution, demo agent, and callback client.
- **API integration:** Expose the start-background-task tool from assistant_api, start the worker workflow, and validate worker callbacks before persisting conversation updates.
- **Shared definitions and docs:** Keep shared agent definitions in the assistant library and document the V1 boundary in ADR-004.
5) Run dual PR review and update PR
Run the bundled helper:
review_result="$(
node "$HOME/dotfiles/claude-skills/babysit-pr/scripts/run_dual_pr_review.mjs" \
--worktree-root "$worktree_root" \
--repo "$repo" \
--pr-number "$pr_number" \
--pr-url "$pr_url" \
--base-ref "$base_ref"
)"
Parse review_result as JSON:
- status: approved, revise, or error
- reviewers: reviewer status map
- review_file: local review artifact path, when available
- review_comment: PR comment upsert result, when available
- error: summary of review or comment publication errors, when present
Do not block on any Step 5 result, including status=="error" or invalid JSON. Carry the parsed result or raw helper output into the final status only.
6) Return final status
Use one of:
SUCCESS: dd-gitlab checks green, PR body updated, and review summary comment upserted | PR: <url>
SUCCESS: dd-gitlab checks green, PR body left unchanged because existing body is unmarked, and review summary comment upserted | PR: <url>
SUCCESS: dd-gitlab checks green and PR body step completed, but review summary comment was not upserted | PR: <url> | Warning: <exact review or upsert error summary>
BLOCKED: merge conflict check failed | PR: <url> | Error: <summary>
BLOCKED: rollup-only dd-gitlab failure without fetchable jobs | PR: <url>
BLOCKED: external-looking dd-gitlab failure but branch already includes latest base | PR: <url>
BLOCKED: code-implement-loop failed | PR: <url> | Error: <summary>
BLOCKED: <reason> | PR: <url>