integrate
Use when the user says "integrate", "review bot PRs", "merge bot fixes", "check bot work", "triage PRs", or explicitly "/integrate". Reviews Claude bot PRs, validates them with available test infrastructure, and merges or rejects.
| name | integrate |
| description | Use when the user says "integrate", "review bot PRs", "merge bot fixes", "check bot work", "triage PRs", or explicitly "/integrate". Reviews Claude bot PRs, validates them with available test infrastructure, and merges or rejects. |
Process reference:
.claude/process.md defines the label taxonomy, workflow states, and conventions that this skill must follow.
Review, validate, and merge PRs created by the Claude bot from @claude issue tasks.
Bot PRs follow the branch pattern claude/issue-{N}-{DATE}-{TIME} or bot/issue-{N} (local develop agents).
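
When a script needs the issue number behind a bot branch, the two patterns above can be matched with plain shell pattern matching. The helper below is a minimal sketch: `issue_from_branch` is a hypothetical name, and it assumes `{N}` is digits only and places no constraints on the format of `{DATE}` and `{TIME}`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scripts/): print the issue number encoded
# in a bot branch name, or return 1 if the name matches neither pattern.
issue_from_branch() {
  local branch="$1" rest num
  case "$branch" in
    claude/issue-*-*-*) ;;   # claude/issue-{N}-{DATE}-{TIME}
    bot/issue-*) ;;          # bot/issue-{N}
    *) return 1 ;;
  esac
  rest=${branch#*/issue-}    # drop everything up to and including "issue-"
  num=${rest%%-*}            # keep the text before the first "-"
  case "$num" in
    ''|*[!0-9]*) return 1 ;; # assumption: {N} is digits only
  esac
  printf '%s\n' "$num"
}

issue_from_branch "claude/issue-42-20250101-1200"   # prints 42
issue_from_branch "bot/issue-7"                     # prints 7
```

Returning a non-zero status for anything else lets callers skip human-created branches without extra checks.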
Compile non-determinism into deterministic, declarative, reproducible scripts.
Every step in the integration pipeline runs through a script. No ad-hoc bash chains.
scripts/gh-ops.sh integrate PR ISSUE replaces gh pr merge && gh issue close && git pull.
scripts/container-ctl.sh ensure replaces docker compose build && docker compose up -d.
If you're composing commands with &&, you've already lost — script it.
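
To illustrate the principle, here is a rough sketch of how an ad-hoc `&&` chain could become a named, previewable function. It is not the real scripts/gh-ops.sh, whose contents are not shown here. `run`, `integrate_pr`, and `DRY_RUN` are hypothetical names, the `--squash` and `--ff-only` flags are assumptions, and only the three commands named above are covered.

```shell
#!/usr/bin/env bash
# Sketch only: the real scripts/gh-ops.sh is not shown in this document.

# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# integrate_pr PR ISSUE: hypothetical stand-in for "gh-ops.sh integrate".
# Each step must succeed before the next one runs.
integrate_pr() {
  local pr="$1" issue="$2"
  run gh pr merge "$pr" --squash || return 1   # --squash is an assumption
  run gh issue close "$issue"    || return 1
  run git pull --ff-only         || return 1   # --ff-only is an assumption
}

# Preview the steps without touching GitHub or the working tree.
DRY_RUN=1 integrate_pr 12 34
```

A function like this gives every run the same steps in the same order, and the dry-run switch makes the plan reviewable before anything is merged.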
The integration pipeline is packaged as scripts in scripts/:
| Script | Purpose |
|---|---|
| integrate-discover.sh | List all bot branches, group by issue, count attempts, score risk. Outputs JSON. |
| integrate-cleanup.sh | Delete branches for over-attempted issues, comment on GitHub issues. Reads discover JSON. |
| integrate-gate.sh | Fast gate a single branch: calls test-typecheck, test-lint, test-unit. Stashes/restores local state. |
| test-typecheck.sh | TypeScript type checking (tsc --noEmit). |
| test-lint.sh | ESLint static analysis on all source directories. |
| test-unit.sh | Vitest unit tests (src/**/*.test.ts). No browser, no Playwright. |
| test-headless.sh | Headless Playwright tests (Pixel 7, iPhone 14, Desktop Chrome). Excludes Appium. |
| run-appium-tests.sh | Full Appium emulator acceptance with screen recording and archival. |
| run-emulator-tests.sh | Acceptance gate: boots emulator, starts server, runs Playwright emulator tests. |
| server-ctl.sh | Server lifecycle: start/stop/restart/ensure. Used post-merge. |
Run foreground. The user wants to see triage decisions and approve merges. Report progress per-PR: what was checked, what passed/failed, merge or reject decision.
Use the Task tool to run scripts concurrently where steps are independent (e.g., fast-gating multiple low-risk branches in parallel). Present results to the user before proceeding to the next step.
scripts/integrate-discover.sh > /tmp/integrate-candidates.json
This outputs a JSON array with each entry scored by risk:
- reject -- >2 attempts (know-when-to-quit rule)
- low -- single file, <50 lines
- medium -- multi-file within one module, or <100 lines across <=3 files
- high -- core code, >200 lines, server changes, vault/crypto
- skip -- branch has no commits ahead of main

Present the triage table to the user: issue number, title, attempt count, risk, diff size. Ask the user how to proceed (evaluate candidates, clean up first, etc.).
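
The risk tiers can be sketched as a small decision function. This is an illustration under stated assumptions, not the logic of integrate-discover.sh: `classify_risk` and its arguments are hypothetical, the caller decides whether a change is sensitive (core, server, vault/crypto), the "within one module" condition is not modeled, and any case the tiers do not cover is treated conservatively as high.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the risk tiers; the real integrate-discover.sh may
# order or weigh these rules differently.
# Usage: classify_risk ATTEMPTS COMMITS_AHEAD FILES LINES [SENSITIVE]
#   SENSITIVE=1 marks core, server, or vault/crypto changes (caller decides).
classify_risk() {
  local attempts="$1" ahead="$2" files="$3" lines="$4" sensitive="${5:-0}"
  if [ "$ahead" -eq 0 ]; then
    echo skip            # nothing to integrate
  elif [ "$attempts" -gt 2 ]; then
    echo reject          # know-when-to-quit rule
  elif [ "$sensitive" -eq 1 ] || [ "$lines" -gt 200 ]; then
    echo high
  elif [ "$files" -eq 1 ] && [ "$lines" -lt 50 ]; then
    echo low
  elif [ "$files" -le 3 ] && [ "$lines" -lt 100 ]; then
    echo medium
  else
    echo high            # assumption: anything unlisted is treated as high
  fi
}

classify_risk 1 3 1 20      # low
classify_risk 3 5 2 80      # reject
classify_risk 1 2 2 80      # medium
classify_risk 1 1 4 150     # high (falls through to the conservative default)
```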
scripts/integrate-cleanup.sh --file /tmp/integrate-candidates.json
This deletes branches for all reject-risk issues and comments on the GitHub issues
explaining that the bot couldn't converge and human re-scoping is needed.
After cleanup, update labels per .claude/process.md:
scripts/gh-ops.sh labels N --rm bot --add divergence
Options:
- --dry-run -- preview without acting
- --issue N -- clean up a specific issue's branches
- --all -- delete all bot branches (nuclear option)

For each candidate branch (in risk order: low first, then medium, then high):
scripts/integrate-gate.sh <branch-name>
The gate runs 5 tiers:
ui.ts, index.html, app.css). Skipped for non-UI PRs.

The script:

- scripts/test-typecheck.sh, scripts/test-lint.sh, scripts/test-unit.sh

Note: fast gate does NOT run browser tests. Headless Playwright is Step 4.
Exit code 0 = all gates passed, 1 = gate failed, 2 = setup error.
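
The exit-code contract can be illustrated with a minimal runner that stops at the first failing check. This is a sketch, not the body of integrate-gate.sh: `run_fast_gate` is a hypothetical name, and it assumes "setup error" means a check command cannot be found or executed.

```shell
#!/usr/bin/env bash
# run_fast_gate CHECK...: run each check in order; stop at the first failure.
# Returns 0 = all passed, 1 = a check failed, 2 = a check is not runnable.
run_fast_gate() {
  local check
  # Verify every check exists before running any of them.
  for check in "$@"; do
    if ! command -v "$check" >/dev/null 2>&1; then
      echo "setup error: $check not found" >&2
      return 2
    fi
  done
  for check in "$@"; do
    if ! "$check"; then
      echo "gate failed: $check" >&2
      return 1
    fi
    echo "passed: $check"
  done
  return 0
}

# Real usage might be:
#   run_fast_gate scripts/test-typecheck.sh scripts/test-lint.sh scripts/test-unit.sh
run_fast_gate true true
echo "exit code: $?"
```

Keeping setup errors (2) apart from gate failures (1) lets a caller retry a broken environment without blaming the branch.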
To auto-close a PR on failure:
scripts/integrate-gate.sh <branch> --close-on-fail --pr <number>
Run multiple fast gates in parallel using the integrate-gater agent
(.claude/agents/integrate-gater.md). Always spawn with isolation: "worktree" so
each agent gets its own git worktree and can checkout branches without conflicting.
Start with 2 parallel agents and increase if the machine handles it well. The
original 8-agent crash was likely git lock contention (now solved by worktrees),
not pure CPU.
Example agent invocation:
Agent(subagent_type="general-purpose", isolation="worktree", model="sonnet", prompt="<integrate-gater prompt>", description="...")
Always use general-purpose — custom subagent_types are broken in file-based
discovery (see .claude/rules/agents.md). Read .claude/agents/integrate-gater.md
for the prompt content.
Between merges, run headless Playwright to catch regressions quickly:
scripts/test-headless.sh
This is a regression check, not final acceptance. The full emulator acceptance run happens once after all merges complete (Step 7).
device label check: If the issue has the device label (per .claude/process.md),
emulator or real-device validation is mandatory. Do NOT merge with headless-only results
unless the full Appium run in Step 7 will cover it.
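
The device label check can be expressed as a small filter over an issue's label names. The sketch below is illustrative only: `needs_device_validation` and `merge_decision` are hypothetical helpers, and the `yes`/`no` argument is an assumed way of stating whether the Step 7 Appium run will cover the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the label list on stdin (one name per
# line) contains the exact label "device".
needs_device_validation() {
  grep -qx 'device'
}

# Hypothetical decision helper built on the check above.
# Usage: merge_decision APPIUM_PLANNED < labels   (APPIUM_PLANNED is yes|no)
merge_decision() {
  local appium_planned="$1"
  if ! needs_device_validation; then
    echo "ok: headless results are sufficient"
  elif [ "$appium_planned" = yes ]; then
    echo "ok: device label present, covered by the Step 7 Appium run"
  else
    echo "hold: device label present, emulator or real-device validation required"
    return 1
  fi
}

# Real usage would feed labels from GitHub, for example:
#   gh issue view "$issue" --json labels --jq '.labels[].name' | merge_decision yes
printf 'bot\ndevice\n' | merge_decision no || true
```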
The user tests on the production Docker container (mobissh-prod), not a local server.
After all merges complete, rebuild and restart the container:
scripts/container-ctl.sh restart
If the container is stale, the user sees old behavior and files false bugs.
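
One way to catch a stale container is to compare the version it reports with the commit that was just merged. The sketch below assumes the container can report a short commit hash somehow. `check_container_version` is a hypothetical helper, and the `VERSION_URL` endpoint in the usage comment is a placeholder, not a documented API.

```shell
#!/usr/bin/env bash
# Hypothetical check: compare the commit the container reports with the
# commit that was just merged. How the container exposes its version is an
# assumption; it is not documented here.
check_container_version() {
  local expected="$1" served="$2"
  if [ -z "$served" ]; then
    echo "unknown: container did not report a version"
    return 2
  elif [ "$served" = "$expected" ]; then
    echo "current: container serves $served"
  else
    echo "stale: container serves $served, expected $expected"
    return 1
  fi
}

# Real usage might look like this (VERSION_URL is a placeholder):
#   expected=$(git rev-parse --short HEAD)
#   served=$(curl -fsS "$VERSION_URL")
#   check_container_version "$expected" "$served" || scripts/container-ctl.sh restart
check_container_version "abc1234" "abc1234"
```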
force: true Playwright hacks, no inline styles (prefer CSS), no --no-verify bypasses

scripts/gh-ops.sh integrate <PR-N> <issue-N>
This single command: merges the PR, closes the issue, removes the bot label,
pulls main, and prunes stale refs. Worktree cleanup is deferred to release.
For orphaned branches (no PR), create a PR first, then integrate:
scripts/gh-ops.sh pr-create --head <branch> --title "<issue title>" --body "Bot fix for #<N>" --label bot
scripts/gh-ops.sh integrate <PR-N> <issue-N>
When a bot PR:
This is NOT a rejection. The feature is correct; the test harness is outdated. Action:
- /delegate to post a test-fixup @claude comment on the same issue (see the test-fixup template in the delegate skill)
- bot label -- do NOT swap to divergence
- scripts/test-headless.sh

Key distinction: outdated != flaky. Tests that fail because the UX intentionally changed need their assertions updated. Tests that fail intermittently need investigation.
More than 2 prior bot attempts for the same issue
scripts/gh-ops.sh pr-close <N> --comment "Closing: <clear reason with specific failure details>"
scripts/gh-ops.sh labels <issue-N> --rm bot --add divergence
For orphaned branches (no PR), just delete the branch and update labels:
scripts/integrate-cleanup.sh --issue <N>
scripts/gh-ops.sh labels <N> --rm bot --add divergence
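
The attempt limit can be checked mechanically from branch names alone. The following sketch counts bot branches per issue and prints the issues that exceed the limit. `over_attempted_issues` is a hypothetical helper, and treating one branch as one attempt is an assumption about how integrate-discover.sh counts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read branch names on stdin and print the issue numbers
# that have more than MAX bot branches (default 2). Treating one branch as one
# attempt is an assumption about how integrate-discover.sh counts.
over_attempted_issues() {
  local max="${1:-2}"
  sed -n -E 's#^(origin/)?(claude|bot)/issue-([0-9]+).*$#\3#p' \
    | sort -n | uniq -c \
    | awk -v max="$max" '$1 > max { print $2 }'
}

# Real usage might pipe in: git branch -r --format='%(refname:short)'
printf '%s\n' \
  origin/claude/issue-5-20250101-0900 \
  origin/claude/issue-5-20250102-0900 \
  origin/claude/issue-5-20250103-0900 \
  origin/claude/issue-9-20250101-0900 \
  | over_attempted_issues      # prints 5
```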
After each successful merge:
git checkout main
git pull
scripts/test-headless.sh
Report: "Merged PR #N (). Headless tests: X pass."
After ALL merges complete, rebuild the production container:
scripts/container-ctl.sh restart
This is a regression gate between merges. The full acceptance run is Step 7.
After ALL merges are complete, run the full Appium test suite on the emulator. This produces a screen recording for human review of new features.
if [[ ! -e /dev/kvm ]]; then
echo "KVM not available -- emulator cannot run on this machine"
EMULATOR=false
elif ! command -v emulator &>/dev/null && ! command -v adb &>/dev/null; then
echo "Android SDK not installed -- running setup..."
scripts/setup-avd.sh
EMULATOR=true
else
EMULATOR=true
fi
if [ "$EMULATOR" = true ]; then
scripts/run-appium-tests.sh
fi
run-appium-tests.sh handles the full pipeline: server startup, Docker sshd,
emulator boot, ADB forwarding, Appium server, screen recording with debug overlays,
Playwright test execution, artifact collection, and archival to test-history/appium/.
After the run completes:
- playwright-report-appium/test-results-appium/ for per-test artifacts
- test-history/appium/<ISO-8601-timestamp>/recording.webm
- scripts/review-recording.sh

If the emulator is unavailable (no KVM, CI runner), report which PRs need emulator validation and skip this step. Do NOT silently omit it.
When processing multiple PRs:
- scripts/run-appium-tests.sh (Step 7) for final acceptance with recording

Create a TRACE when integrating multiple PRs or when a non-trivial merge/reject decision is made:
scripts/trace-init.sh "integrate-{date-or-theme}"
Record in strategy/initial_plan.md:
Log pivots when:
Populate TRACE.md on completion with:
TRACE is optional for single-PR integration with clear pass/fail. Create a TRACE when: batch integrating, making judgment calls on edge cases, or when a PR reveals a systemic issue.
These rules come from real project history. They are not suggestions.
Know when to quit: >2 bot attempts on the same issue means the issue needs human re-scoping, not more bot retries. Close the PR, comment on the issue with what the bot couldn't solve, and move on.
Mobile UX must be device-tested: touch, gesture, layout, keyboard features cannot be validated headless-only. If the emulator isn't available, flag the PR but do NOT merge.
Stale server trap: the user tests on a running server while code changes happen. Always restart the server and verify the version hash after merging. A stale server means the user sees old behavior and files false bugs.
No force hacks: if a Playwright test needs force: true or timeout: 30000 to pass,
the fix is wrong -- the underlying layout or timing issue needs to be addressed.
Selection overlay precedent: PR went through 6 commits, never worked on real Android, got feature-flagged off. Bot fixes that keep failing acceptance tests should be branched off rather than iterated on main.
No inline styles: prefer CSS classes. This is a project rule (CLAUDE.md).
Outdated != flaky: When a UX change causes headless test failures, those tests need
their assertions updated -- they aren't broken or intermittent. Use the test-fixup pass
(approve-with-test-fixup) to delegate this to the bot rather than rejecting the feature.
The bot can run scripts/test-headless.sh and fix mismatched selectors/assertions.
Two-pass delegation works: Feature pass (fast gate) -> human UX review -> test-fixup pass (full gate including headless). This prevents the bot from guessing UX decisions while still automating the mechanical test updates.
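
Several of the rules above (no force hacks, no inline styles, no --no-verify bypasses) lend themselves to a mechanical pre-merge scan of the diff. The sketch below is a heuristic, not an existing project script. `find_forbidden_additions` is a hypothetical name, the patterns are assumptions about how violations would appear in source, and a match still needs human judgment.

```shell
#!/usr/bin/env bash
# Hypothetical pre-merge scan: read a unified diff on stdin and print added
# lines that contain patterns the rules above forbid. Like grep, it exits 0
# when at least one offending line is found and 1 when none is.
find_forbidden_additions() {
  grep '^+' \
    | grep -v '^+++ ' \
    | grep -E -e 'force:[[:space:]]*true' \
              -e 'timeout:[[:space:]]*30000' \
              -e '--no-verify' \
              -e 'style="'
}

# Real usage might be: git diff main..."$branch" | find_forbidden_additions
printf '%s\n' \
  '+++ b/tests/login.spec.ts' \
  '-  await page.click("#go");' \
  '+  await page.click("#go", { force: true });' \
  | find_forbidden_additions
```

Only added lines are inspected, so removing an existing hack does not trigger a finding.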
- integrate-gate.sh auto-stashes and restores
- run-emulator-tests.sh
- gh api which authenticates via gh token
- gh-ops.sh pr-merge now prunes worktrees and removes local branches before merging. Always run git worktree prune before merge steps.
- .claude/worktrees/. Always use absolute paths or explicit cd to the main repo before running scripts.