| name | ship-issue |
| version | 1.0.0 |
| type | skill |
| description | Drive an issue from triage to merged PR with minimal operator intervention. Runs the full loop — read & label, post agent brief, write a failing test, implement minimally, self-review via subagent, replicate local CI, push, watch CI, squash-merge, clean up, and report with next-step suggestions. Use when the operator says "ship #X", "do issue #X end-to-end", "take #X to merge", "fix and ship #X", "do #X autonomously", or any phrasing that authorizes carrying a single issue all the way through. Also use when the operator follows a /triage call with "do as you'd like", "update the issue then start implementation", or otherwise hands off the rest of the workflow. Do NOT use for purely advisory questions ("what should we do about #X?", "look at #X with me") — those stay in regular triage / discussion mode. |
| targets | ["claude-code","apm","codex","gemini","copilot","pi"] |
| category | {"primary":"workflow","secondary":["integrations","backpressure"]} |
Ship Issue
End-to-end issue-to-merge workflow. Designed to run with one operator authorization upfront — the operator says "ship #X" (or finishes /triage with "do as you'd like, then merge"), and you carry it through.
Why this skill exists
Each step of the workflow is straightforward in isolation, but stitching them together correctly takes attention: red-before-green TDD, replicating CI locally before pushing, getting an independent review before opening the PR (not after), distinguishing transient CI failures from real ones, cleaning up after merge, and ending with momentum-preserving next-step suggestions instead of a lifeless "done." This skill captures that rhythm so it happens reliably without the operator having to prompt each transition.
When the operator wants this
Trigger phrases that mean "go all the way":
- "ship #173", "do issue #173 end-to-end", "take #173 to merge"
- "do as you'd like" (after /triage)
- "update the issue, implement, review, PR, monitor CI, merge, report"
- "fix and ship #X autonomously"
- "do #X" when the issue context is already fully scoped
Phrases that mean don't ship — stay advisory:
- "look at #X with me", "what should we do about #X", "explore #X"
- "triage #X" alone (use the triage skill)
- "draft a fix for #X" without authorization to merge
When in doubt, ask one question: "Authorize the full ship — including merge — or stop after PR for your review?"
The workflow
0. Preflight
Before touching anything:
- Working tree is clean on the issue's target branch (usually
main). If dirty or on a feature branch, ask before proceeding.
- No in-flight branch named after this issue (
fix/<num>-* or feat/<num>-*). If one exists, ask whether to continue or restart.
- Project conventions loaded: read
CLAUDE.md (root + nested), .github/workflows/*.yml, Makefile, and any docs/architecture/ files relevant to the issue area. These tell you the commit style, branch naming, attribution rules, CI commands, and load-bearing design decisions. Do this once, upfront, instead of mid-flow.
- Issue tracker CLI auth works: a quick
gh auth status (or platform equivalent) check is fine; don't make it part of the user-facing narrative unless it fails.
1. Read & triage (skip if already triaged)
If the issue has no labels or is in needs-triage:
- Read the whole issue, including comments, in one call. With GitHub:
gh issue view <num> --json number,title,body,labels,comments,author,createdAt,updatedAt.
- Check the project's role-to-label mapping. Real label names may not match canonical role names (e.g. there may be no
needs-info label). Use what the project has and flag the gap rather than fabricating labels.
- Trace the bug to a root cause before writing the brief. For non-trivial issues, dispatch an exploration subagent with a concrete prompt — ranked hypotheses, file:line evidence, ≤250-word report. Don't do multi-file exploration inline.
- Apply labels: category (
bug / enhancement / etc.), state (ready-for-agent), priority, package, size estimate.
- Post the agent brief as an issue comment. Always lead with the project's required disclaimer if there is one (e.g.
> *This was generated by AI during triage.*).
The agent brief format that works well:
> *This was generated by AI during triage.*
## Agent Brief
### Problem
[1–2 sentences: observed symptom + production/user impact]
### Root cause
[file:line references with brief explanations of how the parts interact.
The "why" matters more than naming the lines — explain the failure mode.]
### Fix shape
[Smallest viable change. Mention alternatives if there's a real choice.
Be explicit about what NOT to touch — adjacent code that looks
related but isn't part of the bug.]
### Out of scope
[Sibling issues, tempting refactors, anything that should stay separate.]
### Tests (red-green TDD)
[Specific test names + what they prove. Note which existing tests
should keep passing as regression guards.]
### Acceptance
[Concrete CI commands + manual repro if applicable.]
### Constraints
[Project-specific: function-length limits, assertion counts,
no-new-deps rules, etc. — pulled from CLAUDE.md.]
The brief is a contract: future-you (or another agent) should be able to implement from it without re-investigating.
2. Branch
git checkout -b fix/<num>-<short-slug> (or feat/... for enhancements). The slug is 2–4 words from the issue title, kebab-case.
3. Red — write a failing test that proves the bug
This is the most important step. A passing test that "demonstrates" the bug isn't a regression test, it's just code.
- Write the test first.
- Run it. Confirm it fails on current
main before writing any production code.
- If the test passes unexpectedly, the bug isn't what you think it is — stop and investigate before writing a fix.
Watch for false negatives — tests that should fail but don't because some unrelated mechanism is masking the bug. Common culprits:
- Stream/queue dedup windows: if the test publishes events through a deduped pipeline, both publishes happen within the dedup window and the second one is silently rejected, hiding the bug. Subscribe at a layer below the dedup (core pubsub instead of a JetStream/Kafka consumer; raw HTTP instead of an idempotent client; etc.).
- Time-based behavior with synthetic time: production behavior may depend on real elapsed time (e.g., a 30 s gap exceeding a 5 s window). Use synthetic times via injected clocks, NOT
sleep — but make sure the synthetic times reproduce the production timing relationships.
- Predicate is pure but production is stateful: testing the predicate in isolation may not catch state-related bugs. Test through the call path that has state.
If the obvious test setup makes the bug invisible, the fix is to change the test (subscribe somewhere else, mock something differently), not to handwave.
4. Green — implement minimally
- Smallest change that turns the test green. No "while I'm here" cleanup. No new abstractions. No defensive code for impossible states.
- Follow the project's coding rules from CLAUDE.md (function-length limits, assertion-per-function minimums, naming conventions, comment policy).
- Comments explain WHY, not what.
- After the change, run the new tests + adjacent tests to confirm nothing else broke.
5. Self-review — independent subagent pass before the PR
Spawn a code-review subagent (e.g. pr-review-toolkit:code-reviewer, or whatever review agent the harness exposes) with a concrete brief: the bug background, the fix shape, what to look at, and specific questions you want answered. Phrase questions to surface same-class issues — e.g., "I fixed the live-tick path; did I miss the same root cause in the backfill path?"
If the reviewer finds a same-class hole, fix it in this same PR (defense in depth — don't ship a half-fix). If the reviewer flags out-of-scope concerns, note them as follow-up issues, don't expand this PR.
6. Replicate CI locally
Read .github/workflows/ci.yml (or equivalent) and run every check the CI runs, in the same order:
- Format check (
gofmt -l ., prettier --check, etc.)
- Vet / static analysis (
go vet ./..., staticcheck ./..., eslint, etc.)
- Tests with the same env vars and flags CI uses (timeouts, race detection, matrix)
If a step depends on infrastructure you can't reproduce locally (Docker, deployed services), say so explicitly when reporting status — don't silently skip.
The point: never push to a PR with the expectation of letting CI find your problems. Push only after you've already replicated CI green locally.
7. Commit
- Conventional Commits style if the project uses it (check recent
git log).
- Subject line: imperative, ≤72 chars, scoped — e.g.
fix(trigger): minute-precision dedup so 30s ticks don't double-fire (#173).
- Body explains the WHY, the failure mode, and the fix shape. Skip the WHAT — the diff says what.
- No AI attribution unless the project's CLAUDE.md says otherwise. No co-author trailer, no "generated by", no robot emoji.
- HEREDOC the message:
git commit -m "$(cat <<'EOF'
fix(scope): subject
Body explaining the why.
EOF
)"
8. Push & open PR
git push -u origin <branch>
gh pr create --title "..." --body "$(cat <<'EOF'
## Summary
- 1–3 bullets, what + why
## Test plan
- [ ] Specific test names that fail on main, pass on branch
- [ ] CI commands replicated locally (gofmt, vet, staticcheck, test, race)
- [ ] Full suite green
EOF
)"
Capture the PR number — every subsequent command needs it.
9. Watch CI
gh pr checks <num> --watch blocks until checks finish. While it runs, don't poll yourself — let the watch command do it.
When checks finish:
- All green → proceed to merge.
- Real failure → diagnose, fix on the branch, push, watch again.
- Transient failure (registry timeout, runner outage, infra hiccup): note it explicitly when reporting; re-run via
gh pr checks <num> --rerun-failed (or rerun the specific job from the UI) before treating it as a real failure.
Distinguishing transient vs real matters because pretending a transient is real wastes the operator's attention, and pretending a real failure is transient ships bugs.
10. Merge & clean up
gh pr merge <num> --squash --delete-branch
git checkout main
git pull --ff-only
git branch -D <branch> 2>/dev/null || true
git fetch --prune
If the PR didn't auto-close the issue (no Closes #N keyword in the PR body, or the issue tracker doesn't auto-link), close it explicitly:
gh issue close <num> --reason completed --comment "$(cat <<'EOF'
> *This was generated by AI during triage.*
Resolved by #<pr-num> (squash-merged into `main` as `<short-sha>`).
EOF
)"
11. Report — with next steps
The report is the end of this shipment but the start of the operator's next move. Format:
- What landed (≤3 bullets: the change, in their language).
- Verification (local CI green, remote CI passed in Xs, race clean).
- Production action for the operator (concrete: redeploy from main, run a migration, restart a service, verify behavior on a specific box). If the issue mentioned production impact, this is mandatory.
- What's next (1–2 sentences pointing at the most logical next issue or follow-up — the operator shouldn't have to ask "what now?").
A flat "merged ✅" is a failure mode of this skill. The operator is mid-flow and needs the next pointer.
Heuristics & gotchas
Decide unilaterally; surface decisions
Throughout the workflow, dozens of small choices appear: branch name, test name, commit subject, slug for the PR. Make the call yourself, state it in one sentence, continue. Don't pile up multiple-choice questions for the operator. Reserve actual questions for taste / architecture / ethics / reversibility (see the user's CLAUDE.md "Question discipline" section if present).
Don't fabricate labels or roles
If the issue tracker doesn't have a needs-info or ready-for-human label and the triage skill expects it, say so once and use the available labels. Don't silently invent.
Subagent for exploration, inline for judgment
Anything that requires reading ≥3 files or searching across the repo: dispatch an exploration agent with a tight ≤250-word brief. Anything that requires deciding between alternatives: do it inline, with the operator able to see the reasoning.
Code review subagent comes BEFORE the PR
Common mistake: open the PR, then ask for a review. By then you've already published a review request to the operator's inbox and shaped their expectations. Get the independent review before push, fold its findings into the same commit, and the PR is born clean.
CI parity matters
If the project pins a Go version, a Node version, or a specific test timeout, replicate that locally. Running the wrong version's tests and declaring "local CI green" is a lie that surfaces 2 minutes later when remote CI fails.
No --no-verify, no --force
Pre-commit hooks and CI exist for a reason. If a hook fails, fix the underlying issue. If you need to amend, prefer a new commit (the user's CLAUDE.md may forbid amending — check).
Cleanup is part of the job
A merged PR with a stale local branch, an unpruned remote ref, and an unclosed issue is a paper cut for the operator. The cleanup steps in §10 aren't optional.
Authorization model
Default: shipping to merge requires explicit authorization. Ways the operator can grant it:
- Direct: "ship #X", "do as you'd like, merge when ready", "take #X to merge"
- Inferred from chained instructions: "update the issue, implement, review, PR, monitor CI, merge, report"
- Pre-existing standing order in CLAUDE.md or memory ("for issues in repo X, run ship-issue without confirming")
Without one of those, default to stopping at PR-open and asking the operator to merge themselves. The user's CLAUDE.md may say "never push directly to main" — that doesn't conflict with this skill (we always go through a PR), but it does mean if the operator hasn't authorized merge, you stop at PR-open.
What this skill is NOT
- Not for multi-issue batches. One issue at a time. If multiple are linked, ship them serially or merge their scope first.
- Not for issues that need real product/architecture decisions. Those need conversation, not autonomous shipping.
- Not for anything touching production data, customer-visible state, or shared infrastructure without an explicit operator confirmation step folded in.
- Not a replacement for the triage skill — it composes with it. If the operator runs
/triage first and fleshes out a brief, this skill picks up from there.
Pairing
| With | When |
|---|
/triage first | If the issue is unlabeled or in needs-triage, run triage first; then ship-issue picks up at step 2. |
--accessory pr-policy | The pr-policy accessory enforces "no direct pushes to main" and "local CI before declaring ready" — both already core to this skill, but the accessory pins them as a global rule for the session. |
ci-watch | Auto-loaded on phrases like "watch CI" — composes naturally with step 9. |
requesting-code-review | Step 5 is essentially a structured invocation of this — load it explicitly if you want the harness's review-request idioms enforced. |