squidsquad
// Orchestrates a multi-agent AI development team — handles setup, workflow coordination, role management, and autonomous dev cycles.
| Field | Value |
|---|---|
| name | squidsquad |
| description | Orchestrates a multi-agent AI development team: handles setup, workflow coordination, role management, and autonomous dev cycles. |
| version | 0.36.0 |
| license | AGPL-3.0 |
You are activating the SquidSquad multi-agent development coordination system. SquidSquad spins up Claude Code CLI instances — one per dev role you define, plus PM, QA, and DM (all mandatory) — that work autonomously on a shared codebase by coordinating through GitHub Issues and a shared .squidsquad/ folder.
No meetings. No message queues. Just git and GitHub Issues.
graph TD
subgraph repo["Git Repository"]
subgraph agents["Claude Code Agents"]
RL["[Role] Lead"]
QA["QA"]
PM["PM"]
DM["DM"]
end
subgraph squid[".squidsquad/"]
config["config.md"]
role_dir["[role]/ — one per dev agent\nCLAUDE.md, working-state.md\nplanning/, iterations/"]
pm_dir["pm/\nCLAUDE.md, working-state.md\niterations/"]
qa_dir["qa/ — QA agent"]
dm_dir["dm/ — Delivery Manager"]
vault["vault/ — shared knowledge layer"]
end
GH["GitHub Issues\nBugs & features with labels"]
end
RL --> squid
QA --> squid
PM --> squid
DM --> squid
RL --> GH
PM --> GH
QA --> GH
DM --> GH
SquidSquad always has PM, QA, and DM agents. Dev agents are flexible — you define them at setup time.
| Agent | Owns | Loop |
|---|---|---|
| [role] Lead (one per dev role) | Code for that role, bugs and features via GitHub Issues | Ralph Loop (fix bugs → implement features → test → push) |
| QA | Test results, qa/qa-log.md, bug verification, feature testing | Ralph Loop (E2E tests → verify bugs → test features → health checks → push) |
| PM | Product backlog, human interaction, feature intake, backlog management | Ralph Loop (check human → feature intake → backlog management → push) |
| DM | Delivery packaging, docs, CHANGELOG, version bumps, git tags | Ralph Loop (scan pending-ship → deliver docs → version bump → push) |
Common team shapes:
| Shape | Dev agents | Use when |
|---|---|---|
| fe, be | FE Lead + BE Lead + QA + PM | Full-stack app with separate frontend and backend |
| be | BE Lead + QA + PM | API-only, CLI tool, library, or skill repo |
| api, worker | API Lead + Worker Lead + QA + PM | Backend split across services |
| web, ios, api | Web + iOS + API + QA + PM | Multi-platform product |
| (any names) | Whatever you define + QA + PM | Custom team topology |
When you invoke SquidSquad, it creates the following inside your project root. One folder is generated per dev agent — the example below shows a be-only setup:
.squidsquad/
├── config.md ← project config, test commands, counters, git protocol
├── [role]/ ← one folder per dev agent, named after the role (e.g. skill/, be/, fe/)
│ ├── CLAUDE.md ← full agent instructions (generated by compose.py deploy from sub-skills)
│ ├── SOUL.md ← agent personality (read at session start)
│ ├── working-state.md ← crash recovery state
│ ├── planning/ ← feature planning artifacts (research, context, test plans)
│ └── iterations/ ← iter-N.md logs per cycle
├── dm/ ← Delivery Manager
│ ├── CLAUDE.md ← full agent instructions (generated by compose.py deploy)
│ ├── SOUL.md ← agent personality (read at session start)
│ ├── working-state.md ← crash recovery state
│ └── iterations/ ← iter-N.md logs per cycle
├── pm/ ← Product Manager (human-facing coordinator)
│ ├── CLAUDE.md ← full agent instructions (generated by compose.py deploy)
│ ├── SOUL.md ← agent personality (read at session start)
│ ├── enhancements.md ← product backlog / enhancement proposals
│ ├── iterations/ ← iter-N.md logs per cycle
│ └── migrations/ ← historical migration logs (markdown tracker era)
├── qa/ ← QA (same structure as [role]/, plus qa-log.md)
└── vault/ ← shared memory layer (all agents R/W)
    ├── BRIEFING.md ← daily context briefing read by all agents at boot
    ├── projects/ ← active project context, goals, constraints
    ├── areas/ ← ongoing concerns: conventions, preferences, values
    ├── resources/ ← reference material, external docs
    ├── archives/ ← shipped features, closed decisions, historical context
    └── galaxy/ ← atomic knowledge notes (decisions, patterns, learnings)
Note: All agents share a single GitHub Issues tracker on the repo. Issues are labeled by role, type, and status. QA queries for Pending Test items, verifies them, and comments. DM queries for Pending Ship items and handles delivery.
For an fe, be setup, the structure gains an fe/ folder alongside be/.
SquidSquad uses GitHub Issues as its tracker. All bugs and features are GitHub Issues with structured labels. Agents use gh CLI to create, query, update, and comment.
| Category | Labels | Purpose |
|---|---|---|
| Type | type:issue, type:task | What kind of item |
| Priority | priority:high, priority:medium, priority:low | Triage ordering |
| Status | status:open, status:pending, status:planning, status:planned, status:approved, status:in-progress, status:pending-test, status:pending-human-review, status:pending-human-setup, status:pending-ship, status:shipped | Workflow state |
| Role | role:skill, role:fe, role:be, role:pm, role:qa, role:dm | Which agent owns it |
| Severity (bugs) | severity:high, severity:medium, severity:low | Bug impact |
| Special | squidsquad, improvement-scan | Metadata |
Status labels for issues (bugs): status:open → status:in-progress → status:pending-test → status:pending-ship → (Issue closed)
Status labels for tasks (features): status:pending → status:planning → status:planned → status:approved → status:in-progress → status:pending-test → status:pending-ship → (Issue closed)
Agents can also transition to status:pending-human-review (HITL design review or PR review gate) or status:pending-human-setup (blocked on human environment/tool setup). Both route back to status:in-progress once resolved.
Note:
- pending = awaiting human approval to plan.
- planning = PM running Feature Intake (Research → Discussion → Planning).
- planned = planning complete, awaiting human approval for execution.
- approved = human greenlit, dev picks it up.
- pending-human-review = awaiting human review (design iteration or PR).
- pending-human-setup = paused for human to complete tool/environment setup.
- pending-ship = QA verified, DM handles delivery.
- Closed = shipped.
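
The two lifecycles above can be read as a small state machine. The sketch below is a minimal illustration, not the enforcement logic in tracker.py: the function name `can_transition` is hypothetical, only the forward path is modeled, and it assumes the two human gates are entered from in-progress (the text only says they route back to it).

```python
# Forward lifecycles transcribed from the status label sequences above.
# "closed" stands in for closing the Issue; it is not a label.
FLOWS = {
    "issue": ["open", "in-progress", "pending-test", "pending-ship", "closed"],
    "task": ["pending", "planning", "planned", "approved",
             "in-progress", "pending-test", "pending-ship", "closed"],
}
HUMAN_GATES = {"pending-human-review", "pending-human-setup"}


def can_transition(kind: str, current: str, target: str) -> bool:
    """Return True if current -> target is one forward step for this kind."""
    flow = FLOWS[kind]  # raises KeyError for an unknown kind
    # Both human gates route back to in-progress once resolved.
    if current in HUMAN_GATES:
        return target == "in-progress"
    # Assumption: gates are only entered from in-progress.
    if target in HUMAN_GATES:
        return current == "in-progress"
    if current not in flow or target not in flow:
        return False
    return flow.index(target) == flow.index(current) + 1
```

For example, `can_transition("task", "planned", "approved")` holds, while `can_transition("issue", "open", "pending-ship")` does not because it skips two steps.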
Discussion entries are Issue comments using the same timestamped, role-signed format:
> [2026-01-15 09:00] **pm**: Proposed for this sprint.
> [2026-01-15 09:30] **fe-lead**: Picking this up. Status → In Progress.
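
A minimal sketch of producing an entry in that format. The helper name `format_discussion_entry` is hypothetical; only the output shape is taken from the examples above.

```python
from datetime import datetime


def format_discussion_entry(role: str, message: str, when: datetime) -> str:
    """Build one timestamped, role-signed Discussion line as a blockquote."""
    return f"> [{when:%Y-%m-%d %H:%M}] **{role}**: {message}"


entry = format_discussion_entry(
    "pm", "Proposed for this sprint.", datetime(2026, 1, 15, 9, 0)
)
print(entry)
```

The resulting string could then be passed as the body of an Issue comment, for example with `gh issue comment <number> --body "..."`.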
Features go through a deep, research-driven lifecycle before reaching the dev agent:
- FEAT-XXX-RESEARCH.md
- FEAT-XXX-CONTEXT.md
- FEAT-XXX-TEST-PLAN.md

Planning files live in .squidsquad/[role]/planning/ and are auto-deleted after ship (git preserves them). Bugs are excluded: they use the current lightweight flow. Trivial/cosmetic features can use light mode (PM skips research).
Each agent runs its own Ralph Loop — an autonomous work cycle that repeats on an interval. On startup, agents invoke /loop [INTERVAL]m execute one Ralph Loop cycle to schedule repeating cycles. The /loop command handles timing and re-invocation reliably — agents do NOT manually sleep or self-loop. Each cycle prints visible start/stop markers with timestamps (e.g. [🦑] ---- cycle 3 started at 14:32:07 ----) so the human can spot cycle boundaries in terminal scrollback.
Every step within the loop also prints a [🦑 HH:MM:SS] timestamped marker (e.g. [🦑 HH:MM:SS] Pulling latest..., [🦑 HH:MM:SS] Triaging bugs...). Key sub-actions (filing bugs, verifying fixes, committing) get their own markers too. This makes SquidSquad activity easy to scan in scrollback.
Iteration log retention: each agent keeps the last 20 iteration files in its iterations/ directory. After logging a new iteration, older files beyond this limit are deleted. Git history preserves them if ever needed.
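
The retention rule can be sketched as below. This is an illustration under assumptions, not the shipped cleanup code: the function name `prune_iterations` is hypothetical, and "last 20" is interpreted as the 20 highest values of N in iter-N.md.

```python
import re
from pathlib import Path

ITER_RE = re.compile(r"^iter-(\d+)\.md$")


def prune_iterations(iterations_dir: Path, keep: int = 20) -> list[Path]:
    """Delete all but the `keep` highest-numbered iter-N.md files.

    Returns the deleted paths, oldest first.
    """
    numbered = []
    for path in iterations_dir.iterdir():
        match = ITER_RE.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    numbered.sort()  # ascending by N, so the oldest files come first
    excess = max(len(numbered) - keep, 0)
    stale = [path for _, path in numbered[:excess]]
    for path in stale:
        path.unlink()
    return stale
```

Sorting on the parsed integer rather than the file name matters here: a plain string sort would place iter-10.md before iter-2.md.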
All agents maintain a working state file (.squidsquad/[role]/working-state.md) that tracks the current task, completed steps, and remaining work. This file is read on startup to resume mid-task after a context window reset. Agents also check context pressure at the start of each cycle — if context_window.used_percentage exceeds the threshold in config.md (default 70%), they save state, commit, and continue (Claude Code compresses prior messages automatically). All agents write their context pressure percentage to .squidsquad/[role]/context-pressure each cycle.
Agent health monitoring: The harness monitors agent liveness via direct PID checks (primary) and .health files (fallback). .health files (.squidsquad/[role]/.health) track each agent's lifecycle with values: booting (startup), epoch timestamp (heartbeat — written every 5s while alive), dead (on exit), or error|reason (pre-flight failure, e.g. error|gh auth failed). An agent is considered alive if its heartbeat is less than 10s old. Pre-flight checks (gh auth, correct branch) run before launching — failures prevent crash loops. The harness automatically reboots dead agents when their intent is running, with crash recovery via .harness-state.json.
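
A minimal sketch of interpreting a .health value as described above. The function name `classify_health` and the returned strings are illustrative assumptions; the harness's own implementation may differ.

```python
def classify_health(raw: str, now: float, max_age: float = 10.0) -> str:
    """Map the contents of a .health file to a coarse liveness state.

    `now` is the current epoch time in seconds; `max_age` is the 10s
    heartbeat window from the text.
    """
    value = raw.strip()
    if value in ("booting", "dead"):
        return value
    if value.startswith("error|"):
        # Pre-flight failure, e.g. "error|gh auth failed".
        return "error: " + value.split("|", 1)[1]
    try:
        heartbeat = float(value)
    except ValueError:
        return "unknown"  # assumption: unparseable content is not trusted
    return "alive" if now - heartbeat < max_age else "dead"
```

In practice `now` would come from `time.time()`; it is a parameter here so the logic can be checked without touching the clock.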
Auto versioning: DM tracks a Shipped Since Last Bump counter in config.md. Each time an item is marked Shipped, the counter increments. When the counter reaches Ship Threshold (default 10) AND zero open issues exist, DM automatically bumps the minor version (e.g. 0.27.0 → 0.28.0), updates config.md and SKILL.md frontmatter, adds a CHANGELOG section, creates a git tag, and pushes. Version bumps bypass PR flow.
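
The bump condition and the version arithmetic can be sketched as follows. The function names are hypothetical, and resetting the patch component to 0 is an assumption based on the single 0.27.0 → 0.28.0 example.

```python
def should_bump(shipped_since_last_bump: int, open_issues: int,
                ship_threshold: int = 10) -> bool:
    """True when the shipped counter has reached the threshold AND no issues are open."""
    return shipped_since_last_bump >= ship_threshold and open_issues == 0


def bump_minor(version: str) -> str:
    """Increment the minor component of a MAJOR.MINOR.PATCH string."""
    major, minor, _patch = (int(part) for part in version.split("."))
    # Assumption: the patch component resets to 0 on a minor bump.
    return f"{major}.{minor + 1}.0"
```

With these definitions, `bump_minor("0.27.0")` gives `"0.28.0"`, and `should_bump(10, 2)` is false because two issues are still open.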
Comprehension testing: Verify that agents correctly understand their instructions with automated spec-driven tests. Write a JSON spec file (tests/comprehension/<issue>_spec.json) listing the files to read and questions with expected behavioral answers. The pipeline (references/scripts/run_comprehension_test.py) spawns a test agent that reads only the listed files and answers the questions, then an eval agent that grades answers against expected behavior, producing a results.json. A deterministic pytest wrapper asserts all questions pass — no QA agent in the loop. Add new tests by dropping a spec file; run with python tests/run_tests.py.
Multi-model subagents: Token-heavy tasks (research, discussion prep, test plan drafting, improvement scanning) can be routed to external models via API instead of spawning Claude subagents. Configure which model handles each task type in the Model Routing section of config.md (e.g. Research Model: gpt-5.2, Comprehension Model: claude). The router (references/scripts/model_router.py) provides a model-agnostic interface with automatic fallback to Claude if the external model fails. Comprehension testing is always Claude-only for same-model fidelity. API keys are stored in ~/.squidsquad/secrets with restricted file permissions and read automatically by the model router.
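
The fallback behavior can be illustrated with a generic pattern. This is not the interface of model_router.py; `complete_with_fallback` and its callable parameters are hypothetical stand-ins for "call the external model" and "call Claude".

```python
from typing import Callable


def complete_with_fallback(prompt: str,
                           primary: Callable[[str], str],
                           fallback: Callable[[str], str]) -> tuple[str, str]:
    """Try the primary model; if it raises, use the fallback.

    Returns (source, text) so the caller can log which path answered.
    """
    try:
        return "primary", primary(prompt)
    except Exception:
        # Any failure of the external model routes the task to the fallback.
        return "fallback", fallback(prompt)
```

Returning the source alongside the text keeps the fallback visible to the caller instead of silently hiding an outage of the external model.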
Each dev agent follows this loop, substituting its own role name and tracker paths:
1. git pull
1b. Context pressure check — if above threshold, checkpoint state and continue (Claude Code compresses prior messages automatically)
1c. Resume from working-state.md if active task exists
2. Query GitHub Issues via `gh issue list` with label filters (`role:[role]`, `type:issue`, `status:open` or `status:in-progress`)
→ Write working state, fix issue, clear state on completion
→ If issue touches another agent's domain, file a new Issue with the appropriate `role:` label
→ Update issue status labels, append Discussion as Issue comment
3. Query GitHub Issues via `gh issue list` with label filters (`role:[role]`, `type:task`, `status:approved`)
→ Write working state, implement task, update state as sub-steps complete
→ Update status labels to In Progress, then Pending Test
→ Clear working state on completion, append Discussion as Issue comment
3b. Before marking Pending Test: run automated code review against changed files (configurable model in `config.md` under `Code Review Model`). Findings are dispositioned (fix, file to PM, or justified-ignore) and posted as PR comments. Design-level flaws send the task back to planning automatically.
4. Run [role] test command (from config.md)
5. If quiet cycle (no issues fixed, no tasks progressed):
→ If Improvement Scanning enabled and quiet cycle counter ≥ 3: scan target project for domain-specific improvements (max 2 per scan). Classify each finding as an **issue** (broken, wrong, or stale behavior — e.g. dead code, incorrect docs, failing edge cases) or **task** (new or enhanced capability — e.g. missing test coverage, performance optimization, UX improvement). File issues directly as GitHub Issues with `type:issue` + `status:open`; file tasks through PM with `type:task` + `status:pending` for human approval
→ Otherwise: skip log/commit, go to sleep
6. Log iteration to [role]/iterations/iter-N.md
7. cycle_post.py handles commit, push, and iteration logging
8. /loop handles re-invocation on interval — no manual sleep
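
Steps 2 and 3 of the dev loop filter Issues by label. A minimal sketch of building that query from Python follows; the helper names are hypothetical, and it assumes the `gh` CLI is installed and authenticated. Repeating `--label` narrows the result to issues carrying all of the given labels.

```python
import json
import subprocess


def issue_list_args(role: str, item_type: str, status: str) -> list[str]:
    """Build the gh argument list for one role/type/status query."""
    return [
        "gh", "issue", "list",
        "--label", f"role:{role}",
        "--label", f"type:{item_type}",
        "--label", f"status:{status}",
        "--json", "number,title,labels",
    ]


def query_issues(role: str, item_type: str, status: str) -> list[dict]:
    """Run the query and parse gh's JSON output (requires gh and network access)."""
    result = subprocess.run(
        issue_list_args(role, item_type, status),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

For step 3, a be agent would call `query_issues("be", "task", "approved")`.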
The PM follows this loop:
1. git pull
1b. Context pressure check — if above threshold, checkpoint state and continue (Claude Code compresses prior messages automatically)
1c. Resume from working-state.md if active task exists
1d. Task pickup — check for approved tasks assigned to PM, pick up and execute if found
2. Non-blocking human check-in (print note, continue immediately)
→ If human has provided input: file bugs to tracker; for features, discuss first (predict intent, surface questions, invite refinement), then file and run Feature Intake Process
→ Await human approval before marking features Approved (approval only offered after planning completes)
3. Backlog management — priority changes, feature status updates
3b. If GitHub Issues ingestion enabled: `gh issue list` → ingest new issues into trackers
4. If quiet cycle (no human input, no intake work): skip log/commit, go to sleep
5. Log iteration to pm/iterations/iter-N.md
6. cycle_post.py handles commit, push, and iteration logging
7. /loop handles re-invocation on interval — no manual sleep
QA follows this loop:
1. git pull
1b. Context pressure check — if above threshold, checkpoint state and continue (Claude Code compresses prior messages automatically)
1c. Resume from working-state.md if active task exists
1d. Task pickup — check for approved tasks assigned to QA, pick up and execute if found
2. Run full e2e test command (from config.md)
3. Log results to qa/qa-log.md
4. If tests fail: file bug as GitHub Issue with appropriate `role:` label via `gh issue create`
5. Query GitHub Issues via `gh issue list` with label filters for `status:pending-test` items → verify → update to Pending Ship (DM handles delivery → Shipped)
5b. If PR Flow enabled: monitor open PRs, sync comments/merges/changes to Issues
6. Query GitHub Issues via `gh issue list` with label filters for `type:issue` + `status:in-progress` items marked as fixed → verify → close Issue
7. Agent health check: read `.health` files for liveness, `current-state` for phase detail, flag stalled/idle agents
8. If quiet cycle (no issues found, no verifications): skip log/commit
9. Log iteration to qa/iterations/iter-N.md
10. cycle_post.py handles commit, push, and iteration logging
11. /loop handles re-invocation on interval — no manual sleep
All agents follow these rules to minimize merge conflicts on shared tracker files:
- Run git pull before starting any work.
- Change an item's status by editing its labels (gh issue edit --add-label / --remove-label). Closed issues are terminal.
- Prefix every commit message with the agent's role (e.g. skill: fix bug, fe: add button, pm: verify features). This prefix is used by the status line and PM health checks to detect agent activity via git log --grep.
- Commit agent state to the state branch (squid-squad) to avoid cross-agent conflicts. Tracker data lives in GitHub Issues and is conflict-free.

When PR Flow: yes is set in config.md, dev agents create PRs instead of pushing directly to main:

- Branch names follow the Branch Pattern field in config.md. Default: squidsquad/task/{number} (e.g. squidsquad/task/67). All roles share one branch per task, enabling single-PR holistic review of planning + code together. Projects can also use role-scoped patterns like squidsquad/{role}/{number} for per-role branches.
- Branching is handled for the agent (task-begin / task-end); agents don't manage branches manually.
- When a task reaches Pending Test, create a branch, push it, and open a draft PR via gh pr create --draft. QA converts it to ready (gh pr ready) after verification passes. This prevents premature merges before QA sign-off.
- QA monitors open PRs via gh pr list. For each PR:
  - If merged: set the task to Pending Ship (DM handles delivery and ships).
  - If changes are requested: set the task back to In Progress and append the feedback to Discussion.
- When PR Flow: no (default), agents push directly to main as before.

When Auto Merge: yes and Branch Workflow: yes are set in config.md, the harness automatically merges PRs for verified tasks, so you don't need to merge them manually. Agents request merges via POST /merge on the harness API; the harness executes the merge, and if the PR touched template files (references/), automatically recomposes agent templates and reboots only the affected agents.

- Add the merge:manual label to any task to skip auto-merge and require human review.
- If a merged PR touched references/, the harness runs compose.py deploy-all and reboots agents whose CLAUDE.md or SOUL.md changed. This eliminates missed recompositions.
- When Auto Merge: no (default for new installs), all PRs require manual merge as before.

When this skill is invoked (via npx squidsquad or /squidsquad-setup), the
install wizard walks the user through an interactive setup. The canonical,
step-by-step runbook the installer agent follows is:
references/wizard/WIZARD.md
That file covers every step (0 / 0b / 1..7), the intent classifier prompt, the manifest-driven setup_requirements walker, the review screen, the installer agent lifecycle, and error recovery. Do NOT reimplement the setup flow inline here — the runbook is the single source of truth. SKILL.md describes SquidSquad's architecture; WIZARD.md describes how to install it.
The install wizard uses these mechanical helpers:
- references/scripts/wizard.py: gh prerequisite check, re-run detection, repo metadata, project-name validation, config.md writer, filesystem scaffolder, label management and migration
- references/scripts/manifest.py: role / capability / preset manifest registry (loading, validation, pipeline resolution)
- references/scripts/compose.py deploy <role>: per-role CLAUDE.md composition (role template + shared sub-skills + placeholder substitution)

All three expose JSON-output CLI commands so the prose runbook can call them and act on structured results without parsing free text.
The taxonomy the installer uses is not a hardcoded table. It lives in:
- references/roles/<role>/: per-role directory containing manifest.yaml, SOUL.md, and instructions.md (the entry file for composition)
- references/sub-skills/capabilities/<capability>/: per-capability registry with manifest.yaml, setup.md (infrastructure walkthrough), and sub-skill.md (agent-facing usage composed into consuming role CLAUDE.md at runtime)
- references/presets/<preset>/: preset manifests declaring role_install_order (PM, QA, and DM are implicit and always installed)

Adding a new role, capability, or preset is a pure data change: drop in a new directory, run the validator, re-deploy. No wizard code change required.
When the user invokes upgrade (via /squidsquad-upgrade or "upgrade squidsquad"), follow this orchestrator-driven flow. No parallel subagents — compose.py handles all template regeneration.
Read .squidsquad/config.md to get installed SquidSquad Version.
Read SKILL.md frontmatter for current version.
If both match: tell the user they're up to date and stop.
Read .squidsquad/.install-spec.json for the install-time configuration (agent list, preset, flags). If the file does not exist (pre-install-spec installations), derive the agent list from the Dev Agents field in config.md and proceed — the upgrade flow works without it.
Rebuild all agent CLAUDE.md files from the current sub-skill sources using the compose skill:
/squidsquad-compose
Or directly via CLI: python references/scripts/compose.py deploy-all
This regenerates .squidsquad/[role]/CLAUDE.md for every configured agent (dev agents from config + PM + DM if present). Placeholder substitution ([ROLE], [INTERVAL], [ROLE_TEST_CMD], etc.) is handled automatically by compose.py.
SOUL.md preservation: compose.py deploy_role never overwrites an existing SOUL.md — it only writes SOUL.md when the file is missing. User customizations to SOUL.md are preserved across upgrades.
Vault preservation: The upgrade does not touch .squidsquad/vault/. All vault content is preserved.
If Architecture Version in config.md is 1 (or absent), add any missing v2 sections with defaults. Do not delete existing v1 sections — agents still read them via config.py.
Check for and add these sections if missing (with defaults):
- ## Preset: Id: software-dev
- ## Tools: (none)
- ## Loop: Interval Minutes: [existing interval value], Context Threshold: [existing threshold value]
- ## Flags: Diagnostics: yes, Improvement Scan: [existing value], PR Flow: [existing value], Vault Remember: [existing value]
- ## Git Branches: Working Branch: [existing value or main], State Branch: [existing value or squid-squad]
- ## Forge Backend: Provider: github, Endpoint: https://api.github.com
- ## Model Routing: carry over existing values or use defaults (Default Model: claude, etc.)

After patching, set Architecture Version to 2.
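
The "add if missing, never delete" patching rule can be sketched as below. This is an illustration only: `add_missing_sections` is a hypothetical helper, and it assumes config.md sections are `## Heading` lines followed by `Key: value` lines, which the actual file layout may not match exactly.

```python
def add_missing_sections(config_text: str, defaults: dict[str, str]) -> str:
    """Append each default section whose heading is not already present.

    Existing sections are left untouched, so v1 content survives the patch.
    """
    existing = {
        line.strip() for line in config_text.splitlines()
        if line.startswith("## ")
    }
    additions = [
        f"{heading}\n{body}\n"
        for heading, body in defaults.items()
        if heading not in existing
    ]
    if not additions:
        return config_text
    return config_text.rstrip("\n") + "\n\n" + "\n".join(additions)
```

Because present headings are skipped, running the patch a second time changes nothing, which matches the idempotent style of the label sync step.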
python references/scripts/wizard.py ensure-labels
This is idempotent — it creates any missing GitHub Issue labels and skips existing ones.
Update SquidSquad Version in .squidsquad/config.md to the current skill version.
git add .squidsquad/ .claude/
git commit -m "squidsquad: upgrade to [VERSION]"
git push
Clone isolation note: Agents running in sibling clones will receive the updated CLAUDE.md on their next git pull (which happens at the start of each cycle via cycle_pre.py). The upgrade only writes to the primary repo — agents are not restarted automatically.
Tell the user: version upgraded from → to, templates regenerated via compose.py, config schema version, any new sections added, label sync result, any failures.
Note: SquidSquad now uses GitHub Issues as its tracker (since v0.10.0). The markdown tracker schemas below are historical; no new schema migrations are needed. Existing markdown tracker files in bugs/, features/, and archived/ are legacy artifacts.
All bugs and features are tracked as GitHub Issues with structured labels. Agents use python references/scripts/tracker.py for all tracker operations. Status transitions are enforced by the script. See the Tracker Protocol sub-skill for details.
Replaced monolithic bugs.md and features.md with individual files in bugs/ and features/ directories.
Added Pending Ship status to the feature lifecycle for DM delivery.
Original monolithic bugs.md and features.md files with inline fields.
/squidsquad-issue: Report a SquidSquad Issue

When the user says /squidsquad-issue (or "report an issue", "squidsquad issue"), collect an issue report and file it to the upstream SquidSquad repo. Works from any agent session.
Instructions:
1. Generate the report: python references/scripts/diagnostics.py report
2. File it upstream: gh issue create -R WallyDoodlez/SquidSquad --title "[Bug]: <user's title>" --body "<report content>"
3. If gh auth fails for the upstream repo, generate a pre-filled URL:
   https://github.com/WallyDoodlez/SquidSquad/issues/new?title=[Bug]:+<title>&body=<url-encoded-body>
   Print the URL for the user to open in their browser.

Privacy: No code, secrets, or project file paths in the report. Config snapshot is auto-sanitized. User previews everything before filing.
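
A minimal sketch of building the pre-filled URL for the fallback path. The helper name `prefilled_issue_url` is hypothetical; `urllib.parse.urlencode` handles the escaping, encoding spaces as `+` as in the template above.

```python
from urllib.parse import urlencode

NEW_ISSUE_URL = "https://github.com/WallyDoodlez/SquidSquad/issues/new"


def prefilled_issue_url(title: str, body: str) -> str:
    """Return a new-issue URL with the title and body pre-filled."""
    query = urlencode({"title": f"[Bug]: {title}", "body": body})
    return f"{NEW_ISSUE_URL}?{query}"


print(prefilled_issue_url("Agent stalls", "line 1\nline 2"))
```

Very long reports may exceed what browsers or servers accept in a URL, so truncating the body before encoding may be worth considering.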
/squidsquad-status: Squad Overview Command

When the user says /squidsquad-status (or "squad status", "show me the squad", etc.), generate a quick dashboard of the entire SquidSquad team. This works from any Claude session in the repo, not just the PM agent.
Instructions:
1. Read .squidsquad/config.md to get the list of dev agents and the loop interval.
2. For each agent, read .squidsquad/[role]/.health if it exists. Values are booting, dead, error|reason, or an epoch timestamp (heartbeat: if less than 10s old, the agent is alive; if stale, the agent is dead). The harness writes the epoch every 5 seconds while the agent runs. If there is no .health file, fall back to git log --oneline --since="[2×interval] minutes ago" --grep="^[agent]:". If commits are found, show the agent as active; if prior commits exist but none are recent, show it as stalled; otherwise show unknown.
3. Get each agent's last commit time: git log --oneline --grep="^[agent]:" -1 --format="%ar"
4. Query the backlog per role with python references/scripts/tracker.py:
   - python references/scripts/tracker.py list-issues [role]: count and list open issues
   - python references/scripts/tracker.py list-tasks [role] --status approved: count and list approved tasks
   - python references/scripts/tracker.py list-tasks [role] --status in-progress: count and list in-progress tasks
5. List recently shipped items: gh issue list --label "squidsquad" --state closed --limit 5 --json number,title
6. Print the dashboard in this format:

🦑 SquidSquad Status — [project name]
══════════════════════════════════════
Agent     Health    Last Commit
─────     ──────    ───────────
skill     active    2 minutes ago
pm        active    3 minutes ago
dm        active    5 minutes ago
Backlog
───────
skill: 2 open bugs (#93, #95), 1 approved feat (#67)
Recently Shipped
────────────────
1. #66 — Deterministic script layer
2. #29 — Agent name aliases
3. ...
/squidsquad-interval: Change Loop Interval On The Fly

When the user says /squidsquad-interval <Nm> (e.g. /squidsquad-interval 10m), change the Ralph Loop interval for all agents without restarting.
Instructions:
1. Parse the argument: a number followed by m (e.g. 5m, 10m, 15m). The m suffix is optional; bare integers are accepted (e.g. 10 is treated as 10m).
2. If the argument is missing or invalid, print the usage: /squidsquad-interval <Nm> (e.g. /squidsquad-interval 10m). Minimum 5 minutes.
3. Find the current interval in .squidsquad/config.md under Iteration Interval > Minutes.
4. Replace the Minutes value with the new interval.
5. Cancel the current schedule: CronDelete with the existing cron job ID.
6. Reschedule: CronCreate with */N * * * * (or appropriate cron expression for larger intervals), the same prompt (execute one Ralph Loop cycle), and recurring: true.
7. Confirm to the user: Interval changed from [old]m to [new]m. All agents will pick up the change on their next cycle.

Other agents detect the change automatically: each agent re-reads config.md at the start of every cycle (Step 1d, Interval Sync) and re-schedules its cron if the interval has changed.
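
Argument parsing and the cron expression can be sketched as follows. The function names are hypothetical, and the handling of intervals of 60 minutes or more (whole hours only) is an assumption, since the text only says "appropriate cron expression for larger intervals".

```python
import re

MIN_MINUTES = 5


def parse_interval(arg: str) -> int:
    """Parse '10m' or a bare '10' into minutes, enforcing the 5-minute minimum."""
    match = re.fullmatch(r"(\d+)m?", arg.strip())
    if not match:
        raise ValueError("usage: /squidsquad-interval <Nm> (e.g. /squidsquad-interval 10m)")
    minutes = int(match.group(1))
    if minutes < MIN_MINUTES:
        raise ValueError(f"minimum interval is {MIN_MINUTES} minutes")
    return minutes


def cron_expression(minutes: int) -> str:
    """Return a five-field cron expression for the given interval."""
    if minutes < 60:
        return f"*/{minutes} * * * *"
    # Assumption: larger intervals are supported only as whole hours under a day.
    if minutes % 60 == 0 and minutes // 60 < 24:
        return f"0 */{minutes // 60} * * *"
    raise ValueError("interval must be under 60 minutes or a whole number of hours")
```

Note that `*/N` restarts at the top of each hour, so an N that does not divide 60 (for example 25) produces one shorter gap per hour.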