| name | skill-parallel-orchestration |
| description | Use when decomposing tasks into parallel sub-tasks or spawning sub-agents. Vendor-agnostic core; load a per-vendor reference for concrete tool names, directory conventions, and invocation syntax. |
| tier | 2 |
| version | 3.7 |
Parallel Orchestration Skill
Purpose: vendor-agnostic protocol for the Orchestrator Role to decompose large tasks into independent units and execute them via parallel sub-agent spawning. Specific tool names, directory layouts, and invocation syntax are delegated to per-vendor reference files in references/.
1. Load the right reference (mandatory step — Read tool, now)
Before applying any protocol below, use the Read tool to load the matching reference file now. Do not proceed to §2 with only this SKILL.md in context — the reference supplies the concrete tool names and invocation syntax that the universal concepts below need to become executable.
1.1 Detection
Walk upward from the current working directory toward the filesystem root (or stop at a .git boundary if that's closer) and check for vendor markers:
| Runtime indicator (first match wins) | Read | Status |
|---|
CLAUDE.md + .claude/agents/ present | references/claude-code.md | Reference implementation (complete, smoke-tested) |
.codex/agents/ directory present | references/codex-cli.md | Scaffold — primitives documented from primary docs (parallel ✅ confirmed); not yet e2e-validated |
GEMINI.md present, no .claude/agents/ | references/gemini-cli.md | Scaffold — subagent format documented; ⚠️ Layer-A (parallel) unconfirmed in primary docs |
.cursor/ directory present | references/cursor.md | Scaffold — primitives documented from primary docs (parallel ✅ max-10); not yet e2e-validated |
.antigravity/ directory present (provisional — ⚠️ ambiguous: Antigravity shares AGENTS.md w/ Codex + ~/.gemini/ w/ Gemini) | references/antigravity.md | Scaffold — primitives documented (parallel ✅ async); dynamic-first (static custom-agent wrappers scaffolded); not e2e-validated |
| None of the above, or vendor has no parallel-spawn primitive | references/sequential-fallback.md | Universal-by-design; unvalidated on non-Claude runtimes (see file for caveats) |
Scaffold status (Tasks 080–081, 2026-06-10): the Codex / Cursor / Antigravity / Gemini references + critic wrappers were authored from vendor docs but not yet validated on real runtimes — each carries a ⚠️ banner and graduates to ✅ only after one operator-run /vdd-multi --no-fix on the actual CLI. Scaffold critic wrappers are generated from one manifest by scripts/generate_wrappers.py (item 6e) — edit scripts/wrappers_manifest.json, never the generated wrappers; Claude Code stays the hand-maintained reference. AGENTS.md alone is not a Codex/Antigravity marker (cross-vendor) — Codex keys on .codex/agents/, Antigravity on a provisional .antigravity/ (ambiguous); tie-break via §1.2. First-match-wins keeps Claude Code precedence in this (Claude) repo.
If cwd is not the project root, walk up looking for the first marker; stop at .git/ or filesystem root. If no marker found, the skill is being invoked outside a framework-managed project — emit a warning to the caller rather than silently falling back, then load sequential-fallback.md.
1.2 Tie-break when multiple indicators match
If a repo carries both CLAUDE.md and GEMINI.md (multi-vendor support), the agent cannot reliably introspect which CLI is hosting it. Use these concrete signals, in order:
- Tool-list fingerprint: if the agent has
Agent (with team_name parameter) + TeamCreate + SendMessage available — load claude-code.md. If it has a Gemini-specific run_shell_command or Cursor's Composer primitives — load the matching reference. Tool availability is the most reliable signal.
- Explicit caller hint: if the orchestrator passed a
runtime: parameter in the skill invocation, honor it.
- Fallback: if still ambiguous, emit a warning "ambiguous runtime; defaulting to sequential-fallback" and load
sequential-fallback.md. Do not guess silently.
2. Universal concepts
2.1 Roles
- Orchestrator: single lead agent that decomposes the task, invokes the parallel-spawn primitive, and merges results. Does not execute domain work itself.
- Teammate: independent worker with isolated context and an explicit artifact contract. Returns a structured report to the orchestrator; does not write to shared files unless the contract says so.
2.2 Two layers
- Layer A — Parallel independent spawn (universal). N teammates working on orthogonal pieces. No mid-work inter-teammate communication; merge happens after all return. Covers parallel critique, parallel exploration, independent atomic tasks.
- Layer B — Peer communication (vendor-dependent). Teammates message each other during work. Required iff teammate A's output depends on inspecting teammate B's in-progress state. Examples: security-vs-performance trade-off debate; frontend/backend API-schema negotiation mid-flight. Not all vendors support this natively — see your reference file.
Decision criterion for Layer A vs B: use Layer B iff teammates must exchange messages during their work (not just in post-hoc merge). Otherwise Layer A.
2.3 Three-phase protocol
- Decompose: split the task into independent units with clear artifact contracts. No shared mutable state. No ordering constraints beyond "all-done → merge". Each unit should fit a single teammate's context budget.
- Spawn: invoke all teammates in a single atomic step using the vendor's parallel-spawn primitive (see your reference file for syntax). Sequential invocations defeat the purpose.
- Merge: collect structured reports → deduplicate by location (±3 lines) → tag same-mechanism agreement
corroborated, escalate only different-mechanism overlap (§6 rule 3) → drop low-severity noise from bikeshedding-only teammates → emit unified artifact.
3. Red Flags (anti-rationalization — universal)
- "Sequential for independent tasks saves complexity." → WRONG. Slower, and you lose per-teammate context isolation. Use the parallel primitive when the runtime supports it.
- "Cross-pollinate critics' outputs to save tokens." → WRONG. Defeats parallel critique — each teammate's independent perspective is the whole point. Merge strictly after all return.
- "One big combined agent call is simpler." → WRONG. Separate teammates get separate context windows, stricter tool restrictions, and clearer failure modes. Collapsing them erases those properties.
- "Parallelism is a quality tool." → WRONG. Parallelism is a scalability tool. More agents ≠ better analysis. Default to 1; fan out only when objectively orthogonal subsystems are identified. See §5.
4. Best Practices (universal)
| DO | DO NOT |
|---|
| Single-invocation parallel spawn | Sequential invocations for independent work |
| Reference an existing teammate definition (by name/type) | Inline a full system prompt when a wrapper exists |
| Clear structured-return contract per teammate | Expect unstructured prose for post-hoc parsing |
| Merge in the orchestrator after all returns | Stream partial outputs between teammates (use Layer B if you genuinely need that) |
5. Exploration default — ONE
Even if the runtime permits N parallel exploration agents, default to 1 for first-pass reconnaissance. Fan out to 2–3 only when objectively orthogonal subsystems are identified.
| Case | Default count |
|---|
| First-pass reconnaissance ("understand the current state") | 1 |
| Well-scoped single-domain question | 1 |
| Independent subsystems with no shared files (frontend + backend + infra) | 2–3, one per domain |
| Same area, larger search space | 1 (sharper prompt, not more agents) |
Why: three parallel Explores on overlapping scope produce ~3× noise with heavy content overlap, not 3× signal.
Rule: parallelism is a last-step optimization for cost/wall-clock applied after scope is understood — not a default exploration tactic.
6. Merge rules (universal)
After all teammates return, apply these in order:
-
Location dedup: issues at the same (file, line ± 3) with overlapping category → merge, keep highest severity, union descriptions and recommendations.
-
Cross-category re-attribution: if a teammate flagged something belonging to a sibling's domain, re-section under the correct owner's block.
-
Severity escalation (mechanism- and model-aware): same-location agreement between same-base-model teammates is corroboration (the finding survived persona/prompt variation), not independent confirmation — same-model pairs pick the same wrong answer ~60% of the time when erring (arXiv:2506.07962). How much escalation an overlap earns depends on two axes — whether the failure mechanisms differ, and how independent the teammates' models are:
| Critic pair | Independence | Same-mechanism agreement earns |
|---|
| Same model, different persona (default) | none (~60% shared-error) | no escalation — corroborated tag only (R3a) |
Same vendor, different tier via --models (haiku/sonnet/opus/fable) | partial (correlated within family) | no escalation — tier-diverse tag only (R3c escalation refuted by mini-exp 078: cross-tier agreement precision 0.66 < 0.73 same-tier; --models kept for recall) |
| Different vendors (needs item 6 adapters) | quasi-independent | open question — ⏳ deferred (item 6); 078 tested tiers, not true cross-vendor independence |
- Same failure mechanism, same-model (default) → do NOT escalate. Severity = max of the duplicates (rule 1); tag the merged finding
corroborated ("flagged by N teammates — weak positive signal"). [R3a]
- Same failure mechanism, tier-diverse
--models config → do NOT escalate either. Severity = max (rule 1); tag tier-diverse (records heterogeneous-model provenance, no severity consequence). [R3c — escalation demoted to tag-only: mini-exp 078 found cross-tier agreement less precise than same-tier (0.66 vs 0.73), so a +1 would manufacture false positives; the --models config is retained as a recall/coverage tool]
- Different failure mechanisms at the same location (e.g., critic-logic: unhandled edge case; critic-security: exploitable injection at the same line) → two distinct analyses regardless of model config: escalate severity by one level. Mechanism-difference test: the scenarios are not paraphrases of each other — orchestrator judgment, documented in the merged report. [R3b]
Env-flatten note: CLAUDE_CODE_SUBAGENT_MODEL, when set, silently overrides every per-critic model pin and collapses a tier-diverse config back to one model. When that env var is present, the tier-diverse tag is inaccurate — downgrade it to plain corroborated (the run is effectively same-model). No escalation is affected (tier-diverse no longer escalates), but the provenance tag should tell the truth.
-
Bikeshedding filter: any teammate signaling convergence: bikeshedding-only (no legitimate findings left — only style nits) → drop its low-severity items from this iteration.
-
Optional severity filter: drop items below a user-specified minimum severity (e.g. --severity=high).
7. Vendor dispatch & the sequential last resort
First, resolve the runtime (§1.1) and use its native parallel adapter: Claude Code (claude-code.md, complete), Codex / Cursor / Antigravity (scaffolds — parallel documented), Gemini (scaffold — Layer-A unconfirmed). The premise that "non-Claude vendors have no parallel primitives" is obsolete (C-07): Codex spawns-and-consolidates, Cursor runs up to 10 concurrent, Antigravity dispatches async subagents.
Only if the runtime is genuinely primitive-less (no spawn mechanism), or you need a proven path on an unvalidated-adapter runtime, or it's deterministic single-session debugging / 1-slot CI → fall back to references/sequential-fallback.md:
- Role-switching through a single session (persona-swap per teammate role).
- Slower by ~N× wall-clock; loses per-teammate context isolation (everything lands in the same session window).
- A degraded last resort, NOT "functionally equivalent" to parallel (C-07); cannot do Layer B.
All universal concepts (§2–§6) — including merge rules and the evidence contract — apply on every path; only the spawn mechanism changes.
Caveat: the fallback protocol is documented as universal-by-design but has only been validated on Claude Code itself (roleplay as a no-Agent-tool runtime). Until a real non-Claude runtime runs an end-to-end task through it, treat the fallback as a proposed pattern rather than a certified code path. File issues / PRs against references/sequential-fallback.md after your first real run.
8. Scripts and Resources
scripts/spawn_agent_mock.py — DEPRECATED (Wave 1, 2026-04-17). POC mock runner. Retained only for fcntl-locking regression tests in tests/test_mock_agent.py. Do not reference from new workflows.
examples/usage_example.md — Claude Code–specific usage walk-through paired with references/claude-code.md.
references/ — per-vendor reference implementations. See §1 for selection.
9. History
- v3.7 (2026-06-10): finished item 6 in-repo (Task 081). Google Antigravity 4th adapter (
agent.json, dynamic-first + static custom-agent form, async parallel ✅, detection ambiguity documented). 6d: vdd-multi "Fallback (Sequential)" → "Vendor dispatch" (resolve runtime → native adapter; sequential = documented last resort); "functionally equivalent" claim removed from vdd-multi + §7 (C-07). 6e: Wave-5 wrapper generator (scripts/generate_wrappers.py + wrappers_manifest.json → 12 wrappers across 4 vendors, Claude excluded as donor; --check drift mode) + KNOWN_ISSUES drift-grep extended to all 5 wrapper dirs. Remaining for item 6: operator e2e validation only.
- v3.6 (2026-06-10): vendor adapter scaffolds for Codex CLI / Gemini CLI / Cursor (roadmap item 6, sub-tasks 6a–6c, in-repo portion). Three references (stub→full for Gemini/Cursor, NEW
codex-cli.md) + 9 thin critic wrappers (3 vendors × logic/security/performance) at real runtime paths (.gemini/agents/, .codex/agents/, .cursor/agents/), all pointing at the same SOT skills + same convergence enum. Primitives verified against primary docs (geminicli.com, developers.openai.com/codex, cursor.com): Codex + Cursor confirm parallel Layer A (Cursor max-10; Codex consolidates); Gemini's parallel multi-spawn is NOT documented — the scaffold records that gap honestly rather than claiming it. §1.1 gains a Codex row (.codex/agents/). Everything ships ⚠️ SCAFFOLD — not e2e-validated; graduation to ✅ + sub-tasks 6d (sequential demotion) / 6e (drift-grep, Wave-5 generator) remain. Read-only critic guarantee mapped per vendor (sandbox_mode="read-only" / readonly:true / tools whitelist).
- v3.5 (2026-06-10): R3c tier-diverse escalation demoted to tag-only (mini-exp 078,
docs/reviews/tier-diverse-experiment-078.md). The pilot's premise — cross-tier agreement is stronger evidence — was refuted: tier-diverse critics produced more same-location overlaps but a smaller fraction were real (precision 0.66 vs 0.73 same-tier). Merge rule 3 gradation middle row + third bullet now tag tier-diverse without +1. The --models config is retained (078 validated it as a recall/coverage tool: highest recall, 100% pooled). Cross-vendor row stays ⏳ (item 6) — 078 tested tiers, not true vendor independence. Only mechanism-difference (R3b) escalates now.
- v3.4 (2026-06-10): R3c tier-diverse escalation (audit-067 C-08, roadmap item 7 R3c — last open slice). Merge rule 3 gains the model-independence gradation table + a third bullet: same-mechanism agreement under a tier-diverse
--models config (critics on different model tiers, env not flattening) earns +1 for CRITICAL/HIGH only, tag tier-diverse. /vdd-multi gains --models=logic:<t>,security:<t>,performance:<t> (Phase 0 parse + escalation-tier resolution, Phase 1 per-critic spawn) with a CLAUDE_CODE_SUBAGENT_MODEL flatten-guard that downgrades to R3a. Cross-vendor row stays ⏳ (item 6). Ships as pilot — empirical payoff under validation (ab-experiment-075 follow-up). v3.2→3.3 were doc-only (item 9 model-pin hygiene §, item 11 evidence-contract reference bumps).
- v3.1 (2026-06-10): severity-escalation redesign (audit-067 C-08, roadmap item 7 R3a/R3b/R3d). Same-model agreement no longer auto-escalates (+1 →
corroborated tag, severity = max); escalation survives only for different-failure-mechanism overlap at the same location; sequential fallback explicitly never escalates. Rationale: persona-differentiated same-model ensembles share error priors (arXiv:2506.07962, arXiv:2601.12307). R3c (model-heterogeneity gradation) deferred — cross-vendor form blocked by vendor adapters (roadmap item 6).
- v3.0 (2026-04-18): vendor-agnostic rewrite. Universal concepts (§2–§6) stay in
SKILL.md; Claude-specific primitives (Agent tool, .claude/agents/, subagent_type, TeamCreate/SendMessage) extracted to references/claude-code.md. Added references/sequential-fallback.md as universal fallback and stubs for Gemini CLI, Cursor, Antigravity. Extraction point established for Wave 5 (multi-vendor generator).
- v2.0 (Wave 1, 2026-04-17): replaced mock-spawn with native Claude Code
Agent tool (Layer A); added Layer B stub. Single-vendor assumption.
- v1.0 (POC): mock-agent via
spawn_agent_mock.py &. See docs/archives/POC_PARALLEL_AGENTS.md.