pr-address
// Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
| name | pr-address |
| description | Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR. |
| user-invocable | true |
| argument-hint | [PR number or URL] — if omitted, finds PR for current branch. |
| metadata | {"author":"autogpt-team","version":"1.0.0"} |
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
Understand the Why / What / How before addressing comments — you need context to make good fixes:
gh pr view {N} --json body --jq '.body'
If GraphQL is rate-limited, `gh pr view` fails. See GitHub rate limits for REST fallbacks.
⚠️ WARNING — PAGINATE ALL PAGES BEFORE ADDRESSING ANYTHING
`reviewThreads(first: 100)` returns at most 100 threads per page AND returns threads oldest-first. On a PR with many review cycles (e.g. 373 threads), the oldest 100–200 threads are from past cycles and are all already resolved. Filtering client-side with `select(.isResolved == false)` on page 1 therefore yields 0 results, even though pages 2–4 contain many unresolved threads from recent review cycles.

This is the most common failure mode: the agent fetches page 1, sees 0 unresolved after filtering, stops pagination, and reports "done" while hundreds of unresolved threads sit on later pages.
One observed PR had 142 total threads: page 1 returned 0 unresolved (all old/resolved), while pages 2–3 had 111 unresolved. Another with 373 threads across 4 pages also had page 1 entirely resolved.
The rule: ALWAYS paginate to `hasNextPage == false` regardless of the per-page unresolved count. Never stop early because a page returns 0 unresolved.
Step 1 — Fetch total count and sanity-check the newest threads:
# Get total count and the newest 100 threads (last: 100 selects the final 100; order within the page is still oldest-first)
gh api graphql -f query='
{
repository(owner: "Significant-Gravitas", name: "AutoGPT") {
pullRequest(number: {N}) {
reviewThreads { totalCount }
newest: reviewThreads(last: 100) {
nodes { isResolved }
}
}
}
}' | jq '{ total: .data.repository.pullRequest.reviewThreads.totalCount, newest_unresolved: [.data.repository.pullRequest.newest.nodes[] | select(.isResolved == false)] | length }'
If total > 100, you have multiple pages — you must paginate all of them regardless of what newest_unresolved shows. The last: 100 check is a sanity signal only; the full loop below is mandatory.
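The "more than one page" rule is plain ceiling division, sketched below. `pages_needed` is a hypothetical helper name, not part of the skill; the default page size of 100 matches the `first: 100` used by the queries in this section.

```shell
# Hypothetical helper: how many reviewThreads pages a given totalCount implies.
pages_needed() {
  local total="$1"
  local page_size="${2:-100}"   # the queries here request first: 100
  if [ "$total" -le 0 ]; then
    echo 1                      # an empty PR still costs one request
  else
    # Integer ceiling division: round up any partial last page.
    echo $(( (total + page_size - 1) / page_size ))
  fi
}
```

For example, `pages_needed 373` prints 4, which matches the 373-thread PR described above. The value is only useful as a cross-check: the loop must still run until `hasNextPage` is false.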
Step 2 — Collect all unresolved thread IDs across all pages:
# Accumulate all unresolved threads — loop until hasNextPage == false
CURSOR=""
ALL_THREADS="[]"
while true; do
AFTER=${CURSOR:+", after: \"$CURSOR\""}
PAGE=$(gh api graphql -f query="
{
repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
pullRequest(number: {N}) {
reviewThreads(first: 100${AFTER}) {
pageInfo { hasNextPage endCursor }
nodes {
id
isResolved
path
line
comments(last: 1) {
nodes { databaseId body author { login } }
}
}
}
}
}
}")
# Append unresolved nodes from this page
PAGE_THREADS=$(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false)]')
ALL_THREADS=$(echo "$ALL_THREADS $PAGE_THREADS" | jq -s 'add')
HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
[ "$HAS_NEXT" = "false" ] && break
done
# Reverse so newest threads (last pages) are addressed first — GitHub returns oldest-first
# and the most recent review cycle's comments are the ones blocking approval.
ALL_THREADS=$(echo "$ALL_THREADS" | jq 'reverse')
echo "Total unresolved threads: $(echo "$ALL_THREADS" | jq 'length')"
echo "$ALL_THREADS" | jq '[.[] | {id, path, line, body: .comments.nodes[0].body[:200]}]'
Step 3 — Address every thread in ALL_THREADS, then resolve.
Only after this loop completes (all pages fetched, count confirmed) should you begin making fixes.
Why reverse? GraphQL returns threads oldest-first and exposes no `orderBy` option. A PR with 373 threads has ~4 pages; threads from the latest review cycle land on the last pages. Processing in reverse ensures the newest, most blocking comments are addressed first, since the earlier pages mostly contain outdated threads from prior cycles.
Filter to unresolved threads only — skip any thread where isResolved: true. comments(last: 1) returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread id (Relay global ID) to track threads across polls.
If GraphQL is rate-limited, see GitHub rate limits for the REST fallback (flat comment list, with no thread grouping or `isResolved`).
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
Already REST — unaffected by GraphQL rate limits or outages. Continue polling reviews normally even when GraphQL is exhausted.
CRITICAL — always --paginate. Reviews default to 30 per page. PRs can have 80–170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including autogpt-reviewer's structured review which is typically posted after several CI runs and sits well beyond the first page.
Two things to extract:
CHANGES_REQUESTED or APPROVED reviews.

Where each reviewer posts:
- autogpt-reviewer: posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as top-level reviews. Not present on every PR. Address ALL items.
- sentry[bot]: posts bug predictions as inline threads. Fix real bugs, explain false positives.
- coderabbitai[bot]: posts summaries as top-level reviews AND actionable items as inline threads. Address actionable items.

gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
Already REST — unaffected by GraphQL rate limits.
Mostly contains: bot summaries (coderabbitai[bot]), CI/conflict detection (github-actions[bot]), and author status updates. Scan for non-empty messages from non-bot human reviewers that aren't the PR author — those are the ones that need a response.
CRITICAL: The only valid sequence is fix → commit → push → reply → resolve. Never resolve a thread without a real code commit.
Resolving a thread via resolveReviewThread without an actual fix is the most common failure mode — it makes unresolved counts drop without any real change, producing a false "done" signal. If the issue was genuinely a false positive (no code change needed), reply explaining why and then resolve. Otherwise:
Address comments one at a time: fix → commit → push → inline reply → resolve.
Use a markdown commit link so GitHub renders it as a clickable reference. Always get the full SHA with git rev-parse HEAD after committing — never copy a SHA from a previous commit or hardcode one:
FULL_SHA=$(git rev-parse HEAD)
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies \
-f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): <description>"
| Comment type | How to reply |
|---|---|
| Inline review (pulls/{N}/comments) | gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="🤖 Fixed in [abc1234](https://github.com/Significant-Gravitas/AutoGPT/commit/FULL_SHA): <description>" |
| Conversation (issues/{N}/comments) | gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="🤖 Fixed in [abc1234](https://github.com/Significant-Gravitas/AutoGPT/commit/FULL_SHA): <description>" |
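Both reply styles share the same body format, so it can be built in one place. The sketch below assumes a hypothetical helper named `fix_reply_body`; it takes the full SHA from `git rev-parse HEAD` and derives the 9-character short form itself, so the link text and link target cannot drift apart.

```shell
# Hypothetical helper: build the markdown reply body for a fixed thread.
fix_reply_body() {
  local full_sha="$1"
  local description="$2"
  local repo_url="https://github.com/Significant-Gravitas/AutoGPT"
  # %.9s truncates the SHA to its first 9 characters for the link text.
  printf '🤖 Fixed in [%.9s](%s/commit/%s): %s\n' \
    "$full_sha" "$repo_url" "$full_sha" "$description"
}
```

A possible call site would be `-f body="$(fix_reply_body "$(git rev-parse HEAD)" "<description>")"` on either of the `gh api` commands in the table.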
Only two situations justify calling resolveReviewThread:
sdk_cwd is pre-validated by _make_sdk_cwd() which applies normpath + prefix assertion before reaching this point").

Anti-patterns that look resolved but aren't (never do these):
- "Accepted, tracked as follow-up": a deferral, not a fix. The concern is still open. Do not resolve.
- "Acknowledged" or "Same as above": these are acknowledgements, not fixes. Do not resolve.
- "Fixed in abc1234" where abc1234 is a commit that doesn't actually change the flagged line/logic: dishonest. Verify git show abc1234 -- path/to/file changes the right thing before posting.

When in doubt: if a code change is needed, make it. A deferred issue means the thread stays open until the follow-up PR is merged.
Codecov patch target is 80% on changed lines. Checks are informational (not blocking) but should be green.
Backend (from autogpt_platform/backend/):
poetry run pytest -s -vv --cov=backend --cov-branch --cov-report term-missing
Frontend (from autogpt_platform/frontend/):
pnpm vitest run --coverage
git diff --name-only $(gh pr view --json baseRefName --jq '.baseRefName')...HEAD
helpers.ts/helpers.py and test those (highest ROI). Colocate tests as *_test.py (backend) or __tests__/*.test.ts (frontend).

After fixing, format the changed code:
- Backend (from autogpt_platform/backend/): poetry run format
- Frontend (from autogpt_platform/frontend/): pnpm format && pnpm lint && pnpm types

If API routes changed, regenerate the frontend client:
cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
trap "kill $REST_PID 2>/dev/null" EXIT
WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
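# The backgrounded "cd ... &" above ran in a subshell, so this shell is still in the repo root.
cd autogpt_platform/frontend && pnpm generate:api:force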
kill $REST_PID 2>/dev/null; trap - EXIT
Never manually edit files in src/app/api/__generated__/.
Then commit and push immediately — never batch commits without pushing. Each fix should be visible on GitHub right away so CI can start and reviewers can see progress.
Never push empty commits (git commit --allow-empty) to re-trigger CI or bot checks. When a check fails, investigate the root cause (unchecked PR checklist, unaddressed review comments, code issues) and fix those directly. Empty commits add noise to git history.
For backend commits in worktrees: poetry run git commit (pre-commit hooks).
Codecov enforces patch coverage on new/changed lines — new code you write must be tested. Before pushing, verify you haven't left new lines uncovered:
cd autogpt_platform/backend
poetry run pytest --cov=. --cov-report=term-missing {path/to/changed/module}
Look for lines marked miss — those are uncovered. Add tests for any new code you wrote as part of addressing comments.
Rules:
address comments → format → commit → push
→ wait for CI (while addressing new comments) → fix failures → push
→ re-check comments after CI settles
→ repeat until: all comments addressed AND CI green AND no new comments arriving
After pushing, poll for both CI status and new comments in a single loop. Do not use gh pr checks --watch — it blocks the tool and prevents reacting to new comments while CI is running.
Note: `gh pr checks --watch --fail-fast` is tempting but it blocks the entire Bash tool call, meaning the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.
Polling loop — repeat every 30 seconds:
gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link
Parse the results: if every check has bucket of "pass" or "skipping", CI is green. If any has "fail", CI has failed. Otherwise CI is still pending.
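The parsing rule above can be sketched as a small filter. `ci_state` is a hypothetical name; it reads one bucket value per line on stdin and prints `green`, `failed`, or `pending`. Treating an empty list of checks as green is an assumption of this sketch, not something the skill specifies.

```shell
# Hypothetical helper: classify CI from bucket values, one per line on stdin.
ci_state() {
  local bucket state="green"
  # The "|| [ -n ... ]" clause keeps a final line that lacks a trailing newline.
  while IFS= read -r bucket || [ -n "$bucket" ]; do
    case "$bucket" in
      pass|skipping) ;;                  # counts toward green
      fail) echo "failed"; return 0 ;;   # any failure decides the result
      *) state="pending" ;;              # anything else means CI is not settled
    esac
  done
  echo "$state"
}
```

One way to feed it: `gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket --jq '.[].bucket' | ci_state`.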
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'
If the result is "CONFLICTING", the PR has a merge conflict — see "Resolving merge conflicts" below. If "UNKNOWN", GitHub is still computing mergeability — wait and re-check next poll.
Check for new/changed comments (all three sources):
Inline threads: re-run the GraphQL query from "Fetch comments". For each unresolved thread, record {thread_id, last_comment_databaseId} as your baseline. On each poll, action is needed if:
- A thread id appears that wasn't in the baseline (new thread), OR
- A thread's last_comment_databaseId has changed (new reply on existing thread)

Conversation comments:
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
Compare total count and newest id against baseline. Filter to non-empty, non-bot, non-author-update messages.
Top-level reviews:
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
Watch for new non-empty reviews (CHANGES_REQUESTED or COMMENTED with body). Compare total count and newest id against baseline.
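The inline-thread baseline comparison described above can be sketched with two plain-text snapshots. The file layout (one `thread_id last_comment_databaseId` pair per line) and the name `changed_threads` are assumptions made for illustration.

```shell
# Hypothetical helper: print thread ids that are new or have a new last comment.
# Each file holds one "thread_id last_comment_databaseId" pair per line.
changed_threads() {
  local baseline_file="$1"
  local current_file="$2"
  awk '
    # First file: remember the last comment id seen for each thread.
    FILENAME == ARGV[1] { last_seen[$1] = $2; next }
    # Second file: report threads that are new or whose last comment changed.
    !($1 in last_seen) || last_seen[$1] != $2 { print $1 }
  ' "$baseline_file" "$current_file"
}
```

Any output means there is something to address. After handling it, overwrite the baseline file with the current snapshot before the next poll.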
React in this precedence order (first match wins):
| What happened | Action |
|---|---|
| Merge conflict detected | See "Resolving merge conflicts" below. |
| Mergeability is UNKNOWN | GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top. |
| New comments detected | Address them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status). |
| CI failed (bucket == "fail") | Get failed check links: gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,link --jq '.[] | select(.bucket == "fail") | .link'. Extract run ID from link (format: .../actions/runs/<run-id>/job/...), read logs with gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed. Fix → commit → push → restart polling. |
| CI green + no new comments | Do not exit immediately. Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for 2 more cycles (60s) after CI goes green. Only exit after 2 consecutive green+quiet polls. |
| CI pending + no new comments | Sleep 30 seconds, then poll again. |
The loop ends when: CI fully green + all comments addressed + 2 consecutive polls with no new comments after CI settled.
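For the "CI failed" row, the run ID can be cut out of a check link with parameter expansion alone. This sketch assumes the link shape quoted in the table (`.../actions/runs/<run-id>/job/...`); `run_id_from_link` is a hypothetical name, and links from non-Actions checks make it return non-zero.

```shell
# Hypothetical helper: extract <run-id> from .../actions/runs/<run-id>/job/...
run_id_from_link() {
  local link="$1"
  local rest
  case "$link" in
    */actions/runs/*) ;;
    *) return 1 ;;               # not a GitHub Actions run link
  esac
  rest=${link#*/actions/runs/}   # drop everything through "/actions/runs/"
  echo "${rest%%/*}"             # keep the segment before the next "/"
}
```

The result can then be passed to `gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed`.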
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
git remote -v # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
git pull {base-remote} {base-branch} --no-rebase
if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
echo "Unresolved conflict markers found — resolve before proceeding."
exit 1
fi
git add <conflicted-files>
git commit -m "Resolve merge conflicts with {base-branch}"
git push
Three distinct rate limits exist — they have different causes, error shapes, and recovery times:
| Error | HTTP code | Cause | Recovery |
|---|---|---|---|
| {"code":"abuse"} | 403 | Secondary rate limit: too many write operations (comments, mutations) in a short window | Wait 2–3 minutes. 60s is often not enough. |
| {"message":"API rate limit exceeded"} | 403 or 429 | Primary REST rate limit: 5000 calls/hr per user | Wait until X-RateLimit-Reset header timestamp |
| GraphQL: API rate limit already exceeded for user ID ... | 403 on stderr, gh exits 1 | GraphQL-specific per-user limit, distinct from REST's 5000/hr and from the abuse secondary limit. Trips faster than REST because each query is charged a point cost. | Wait until the GraphQL window resets (typically ~1 hour from the first call in the window). REST still works, so use the fallbacks below. |
Prevention: Add sleep 3 between individual thread reply API calls. When posting >20 replies, increase to sleep 5.
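The spacing rule fits in a one-line helper so every write loop uses the same delay. `reply_delay` is a hypothetical name; the thresholds are the ones stated above.

```shell
# Hypothetical helper: seconds to sleep between reply API calls.
reply_delay() {
  local reply_count="$1"
  if [ "$reply_count" -gt 20 ]; then
    echo 5    # large batches: back off further
  else
    echo 3    # default spacing between individual writes
  fi
}
```

Inside a reply loop this would be used as `sleep "$(reply_delay "$TOTAL_REPLIES")"`, where `TOTAL_REPLIES` is whatever count the caller tracks.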
The gh CLI surfaces the GraphQL limit on stderr with the exact string GraphQL: API rate limit already exceeded for user ID <id> and exits 1 — any gh api graphql ... or gh pr view ... call fails. Check current quota and reset time via the REST endpoint that reports GraphQL quota (this call is REST and still works whether GraphQL is rate-limited OR fully down):
gh api rate_limit --jq '.resources.graphql' # { "limit": 5000, "used": 5000, "remaining": 0, "reset": 1729...}
# Human-readable reset (date -r is BSD/macOS syntax; on GNU/Linux use date -d @{} instead):
gh api rate_limit --jq '.resources.graphql.reset' | xargs -I{} date -r {}
Retry when remaining > 0. If you need to proceed sooner, sleep 2–5 min and probe again — the limit is per user, not per machine, so other concurrent agents under the same token also consume it.
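Instead of probing at fixed intervals, the wait can be computed from the `reset` epoch. This is a minimal sketch; `seconds_until_reset` is a hypothetical name, and the optional second argument exists only so the arithmetic can be checked without depending on the clock.

```shell
# Hypothetical helper: seconds left until a rate-limit reset epoch (never negative).
seconds_until_reset() {
  local reset_epoch="$1"
  local now_epoch="${2:-$(date +%s)}"   # defaults to the current time
  local wait_s=$(( reset_epoch - now_epoch ))
  if [ "$wait_s" -lt 0 ]; then
    wait_s=0                            # reset already passed
  fi
  echo "$wait_s"
}
```

A possible use: `sleep "$(seconds_until_reset "$(gh api rate_limit --jq '.resources.graphql.reset')")"`, followed by one more `rate_limit` probe to confirm `remaining > 0` before resuming GraphQL calls.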
When GraphQL is unavailable (rate-limited or outage):
- gh pr checks), and the gh api rate_limit probe.
- /pulls/{N}/comments REST, which drops thread grouping, isResolved, and Relay thread IDs. You still get comment bodies and the databaseId as id, enough to read and reply.
- gh pr view, the resolveReviewThread mutation, and any new gh api graphql queries: wait for the quota to reset.

PR metadata reads: gh pr view uses GraphQL under the hood; use the REST pulls endpoint instead, which returns the full PR object:
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.body' # == --json body
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.base.ref' # == --json baseRefName
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.mergeable' # == --json mergeable
Note: REST mergeable returns true|false|null; GraphQL returns MERGEABLE|CONFLICTING|UNKNOWN. The null case maps to UNKNOWN — treat it the same (still computing; poll again).
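So the polling loop can keep comparing against the GraphQL vocabulary in degraded mode, the REST value can be normalised first. `mergeable_from_rest` is a hypothetical helper implementing the mapping in the note above.

```shell
# Hypothetical helper: map REST mergeable (true|false|null) to the GraphQL enum.
mergeable_from_rest() {
  case "$1" in
    true)  echo "MERGEABLE" ;;
    false) echo "CONFLICTING" ;;
    *)     echo "UNKNOWN" ;;     # null (still computing) or anything unexpected
  esac
}
```

For example: `mergeable_from_rest "$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.mergeable')"`.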
Inline comments (flat list) — no thread grouping or isResolved, but enough to read and reply:
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate \
| jq '[.[] | {id, path, line, user: .user.login, body: .body[:200], in_reply_to_id}]'
Use this degraded mode to make progress on the fix → reply loop, then return to GraphQL for resolveReviewThread once the rate limit resets.
Replies — already REST-native (/pulls/{N}/comments/{ID}/replies); no change needed, use the same command as the main flow.
resolveReviewThread — no REST equivalent; GitHub does not expose a REST endpoint for thread resolution. Queue the thread IDs needing resolution, wait for the GraphQL limit to reset, then run the resolve mutations in a batch (with sleep 3 between calls, per the secondary-limit guidance).
sleep 3 between each call

Never batch all replies in a tight loop; always space them out.
When a PR has more than 10 unresolved threads, addressing one commit per thread is slow. Use this strategy instead:
- Group ALL_THREADS by path: threads in the same file can share a single commit.
- git commit → git push → reply to all those threads with the same SHA → resolve them all.

This reduces N commits to (number of files touched), which is usually 3–5 instead of 15–30.
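Grouping by file can be sketched as a stream filter over tab-separated `path` and thread `id` pairs. The name `threads_by_path` and the TSV intermediate format are assumptions of this sketch; files are emitted in the order they first appear, so the newest-first ordering of `ALL_THREADS` carries over.

```shell
# Hypothetical helper: group "path<TAB>thread_id" lines into one line per path.
threads_by_path() {
  awk -F '\t' '
    # First time a path is seen: remember its position and start its id list.
    !($1 in ids) { order[++n] = $1; ids[$1] = $2; next }
    # Later occurrences: append the thread id to that path.
    { ids[$1] = ids[$1] " " $2 }
    END { for (i = 1; i <= n; i++) printf "%s\t%s\n", order[i], ids[order[i]] }
  '
}
```

It could be fed with `echo "$ALL_THREADS" | jq -r '.[] | [.path, .id] | @tsv' | threads_by_path`, giving one line per file with the thread IDs that a single commit can cover.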
For truly independent thread groups (different files, no shared logic), you can post replies in parallel using background subshells — but always space out API writes:
# Post replies to a batch of threads concurrently, 3s apart
(
sleep 3
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID1}/replies \
-f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
(
sleep 6
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID2}/replies \
-f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
wait # wait for all background replies before resolving
Then resolve sequentially (GraphQL mutations):
for THREAD_ID in "$THREAD1" "$THREAD2" "$THREAD3"; do
gh api graphql -f query="mutation { resolveReviewThread(input: {threadId: \"${THREAD_ID}\"}) { thread { isResolved } } }"
sleep 3
done
Always sleep 3s between individual API writes — GitHub's secondary rate limit (403) triggers on bursts of >20 writes. Increase to sleep 5 when posting more than 20 replies in a batch.
Use resolveReviewThread only after the commit is pushed and the reply is posted:
gh api graphql -f query='mutation { resolveReviewThread(input: {threadId: "THREAD_ID"}) { thread { isResolved } } }'
Never call this mutation before committing the fix. The orchestrator will verify actual unresolved counts via GraphQL after you output ORCHESTRATOR:DONE — false resolutions will be caught and you will be re-briefed.
`resolveReviewThread` is GraphQL-only, with no REST equivalent. If GraphQL is rate-limited, see GitHub rate limits for the queue-and-retry flow.
Before claiming "0 unresolved threads", always query GitHub directly — don't rely on your own bookkeeping. Paginate all pages — a single first: 100 query misses threads beyond page 1:
# Step 1: get total thread count
gh api graphql -f query='
{
repository(owner: "Significant-Gravitas", name: "AutoGPT") {
pullRequest(number: {N}) {
reviewThreads { totalCount }
}
}
}' | jq '.data.repository.pullRequest.reviewThreads.totalCount'
# Step 2: paginate all pages, count truly unresolved
CURSOR=""; UNRESOLVED=0
while true; do
AFTER=${CURSOR:+", after: \"$CURSOR\""}
PAGE=$(gh api graphql -f query="
{
repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
pullRequest(number: {N}) {
reviewThreads(first: 100${AFTER}) {
pageInfo { hasNextPage endCursor }
nodes { isResolved }
}
}
}
}")
UNRESOLVED=$(( UNRESOLVED + $(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved==false)] | length') ))
HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
[ "$HAS_NEXT" = "false" ] && break
done
echo "Unresolved threads: $UNRESOLVED"
Only output ORCHESTRATOR:DONE after this loop reports 0.