Run any Skill in Manus with one click

$pwd:

agent-conductor

Name: Agent Conductor
Author: DUBSOpenHub

// Multi-agent fleet conductor with real-time TUI observability. Launches multiple Terminal Stampede commander groups, keeps them coordinated through live collaboration ledgers, and applies sealed Shadow Score evaluation. Say "agent conductor" to start.

Run Skill in Manus

$ git log --oneline --stat

stars:0

forks:0

updated:May 6, 2026 at 13:04

SKILL.md

readonly

package.json

"author": "DUBSOpenHub"

"repository": "DUBSOpenHub/copilot-skills"

View GitHub Repository

$ install --globalskills.sh

$ download --local

Run Skill in Manus

[HINT] Download the complete skill directory including SKILL.md and all related files

Run any Skill with one click

name	agent-conductor
description	Multi-agent fleet conductor with real-time TUI observability. Launches multiple Terminal Stampede commander groups, keeps them coordinated through live collaboration ledgers, and applies sealed Shadow Score evaluation. Say "agent conductor" to start.
tools	["bash","glob","view","sql","ask_user","task","read_agent","list_agents"]

Agent Conductor

Agent Conductor coordinates exactly five commander-led agent groups so they can propose, review, improve, converge, and broadcast learnings while the run is active. It uses Terminal Stampede for visible tmux panes and process lifecycle, bounded sub-agent fan-out semantics, Agent Pulse or Stampede monitor for real-time observability, and Shadow Score for sealed post-run quality gates.

Use user-facing terminology sub-agents. The compatibility ledger filename child-agents.jsonl may appear in paths, but responses should use sub-agents.

Command Grammar

Pattern	Action
`agent conductor`	Ask only for missing mission/repo; default to Premium Max + Agent Pulse + Stampede monitor
`agent conductor on REPO : MISSION`	Launch with Premium Max + Agent Pulse + Stampede monitor unless tier/scale are explicit
`agent conductor premium max on REPO : MISSION`	Launch 5 premium commander groups with Agent Pulse + Stampede monitor
`agent conductor standard small on REPO : MISSION`	Launch 5 standard-tier commander groups with Agent Pulse + Stampede monitor
`agent conductor status [RUN_ID]`	Show concise run stats and insights
`agent conductor teardown RUN_ID`	Stop the underlying Stampede tmux session

Default repo is the current working directory. Mission text after : is required for a launch.

Launch Defaults and Questions

Default launch settings:

model_tier: Premium - frontier/premium models for commanders and sub-agents.
scale: Max - exactly 5 commander groups.
dashboard: Agent Pulse + Stampede monitor.

Do not ask tier, scale, or dashboard questions when those fields are absent. Only ask one question at a time for truly missing launch inputs:

Ask for mission when no text after : was provided.
Ask for repo only when no on REPO was provided and the current working directory is not the intended target or is not a usable repository.

Explicit user input still wins: standard small launches the smaller standard tier, and premium max launches the default max profile explicitly. Scale words do not reduce commander cardinality: every Agent Conductor metaswarm run launches exactly five commanders.

Commander Cardinality Contract

Agent Conductor has one non-negotiable launch shape: exactly five commander groups, commander-001 through commander-005.

Do not launch an Agent Conductor metaswarm with fewer or more than five commanders.
Do not translate small, standard, local capacity, or transient model errors into a lower commander count.
Each commander manifest must bind task_id and commander_id to the same exact value, e.g. commander-003.
The installed Stampede launcher must be the hardened runtime that rejects --metaswarm --count values other than 5, pre-claims exact commander manifests, and writes a terminal bundle.json plus results/commander-###.json for every commander exit path.
If five commanders cannot be launched, mark the run failed or partial and tell the user why. Never silently downscale.

Model Tiers

Premium model rotation:

claude-opus-4.7,gpt-5.5,claude-opus-4.6,gpt-5.4,claude-opus-4.5,gpt-5.2,claude-sonnet-4.6,gpt-5.3-codex,claude-sonnet-4.5,gpt-5.2-codex

Standard model rotation:

claude-sonnet-4.6,gpt-5.4,claude-sonnet-4.5,gpt-5.3-codex,gpt-5.2-codex,gpt-5.2

Never silently use these models for Agent Conductor sub-agents:

claude-haiku-4.5,gpt-5.4-mini,gpt-5-mini,gpt-4.1

This is a no mini/cheap silent downgrade rule: partial capacity must stay partial rather than falling back to cheap or mini models without explicit user direction.

Pass the chosen tier through the run state, commander manifests, and Stampede environment. Stampede exposes the active policy as:

STAMPEDE_MODEL_POLICY
STAMPEDE_PREMIUM_MODEL_POOL
STAMPEDE_BANNED_CHILD_MODELS
STAMPEDE_SQUAD_LEADS_PER_COMMANDER
STAMPEDE_WORKERS_PER_SQUAD_LEAD
STAMPEDE_WORKERS_PER_COMMANDER

Prerequisites and Preflight

Before launch, verify the local runtime exists:

command -v copilot
command -v tmux
command -v python3
command -v git
test -x ~/bin/stampede.sh
test -x ~/bin/stampede-monitor.sh
grep -q "stampede_synthesize_commander_bundle" ~/bin/stampede.sh
grep -q "Your exact commander task is pre-claimed" ~/bin/stampede.sh
grep -q "requires exactly 5 commanders" ~/bin/stampede.sh

If ~/bin/stampede.sh or ~/bin/stampede-monitor.sh is missing, stop and tell the user to run the Agent Conductor quick installer or install Terminal Stampede first. Do not start an Agent Conductor run without the Stampede launcher and monitor because the dashboard and recovery contracts depend on their run files.

If any hardened-launcher marker check fails, stop before launch and reinstall the tested Terminal Stampede runtime. Do not run Agent Conductor against a stale launcher, because stale installed copies can start five panes while omitting deterministic pre-claim and terminal bundle synthesis.

Run Layout

Agent Conductor reuses Stampede's repo-local runtime so Agent Pulse and tmux monitors work without another service:

REPO/.stampede/RUN_ID/
  state.json
  fleet.json
  queue/
  claimed/
  results/
  commanders/commander-###/
    manifest.json
    context-capsule.json
    assignments.json
    swarm-state.json
    child-agents.jsonl
    bundle.json
    atoms/
    logs/
  collab/
    protocol.json
    proposals.jsonl
    reviews.jsonl
    improvements.jsonl
    consensus.jsonl
    broadcasts.jsonl
  shadow-score/
    sealed/
      criteria.json
    seal.sha256
    scorecard.json
  orchestrator-commentary.json
  orchestrator-commentary.jsonl

orchestrator-commentary.json must stay concise: stats and insights only. Good shape:

cmd 3/5 active · sub-agents 112 running / 480 done / 620 seen · q 0 · claimed 3 · results 2/5
collab p5 r18 i11 c8 b7
commander-004 launching_workers · squads 32/50 · sub-agents 160/250 · run 42 done 118 fail 0

Structured schema:

{
  "ts": "ISO-8601",
  "run_id": "run-...",
  "profile": "metaswarm",
  "queue": 0,
  "claimed": 0,
  "results": 5,
  "total_tasks": 5,
  "lines": [
    "cmd 0/5 active · sub-agents 0 running / 1250 done / 1250 seen · q 0 · claimed 0 · results 5/5",
    "collab p5 r18 i11 c8 b7"
  ],
  "agents": [
    {
      "id": "commander-001",
      "status": "success",
      "phase": "complete",
      "active": false,
      "model": "claude-opus-4.7",
      "heartbeat_age_s": 4,
      "squad_leads": 50,
      "squad_target": 50,
      "workers": 250,
      "worker_target": 250,
      "sub_agents_seen": 300,
      "sub_agents_running": 0,
      "sub_agents_done": 300,
      "sub_agents_failed": 0
    }
  ]
}

Step 0 - SQL Tracking

Create tables if needed:

CREATE TABLE IF NOT EXISTS agent_conductor_runs (
    run_id TEXT PRIMARY KEY,
    repo_path TEXT NOT NULL,
    mission TEXT NOT NULL,
    model_tier TEXT NOT NULL,
    scale TEXT NOT NULL,
    commander_count INTEGER NOT NULL,
    status TEXT DEFAULT 'running',
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE IF NOT EXISTS agent_conductor_events (
    run_id TEXT NOT NULL,
    ts TEXT DEFAULT (datetime('now')),
    event TEXT NOT NULL,
    detail TEXT
);

Step 1 - Parse and Normalize

Extract:

Field	Source	Default
`repo_path`	`on REPO`	current working directory
`mission`	text after `:`	ask if missing
`model_tier`	premium/standard	premium
`scale`	small/standard/max	max
`dashboard`	requested dashboard	Agent Pulse + Stampede monitor

Map scale to commander count. These labels are preserved for user intent and future policy, but they all keep the same five-commander metaswarm cardinality:

Scale	Commanders
Small	5
Standard	5
Max	5

Step 2 - Create Run Directory

Use run-YYYYMMDD-HHMMSS.

RUN_ID="$(python3 - <<'PY'
import datetime
print("run-" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
PY
)"
BASE="$REPO_PATH/.stampede/$RUN_ID"
mkdir -p "$BASE"/{queue,claimed,results,logs,pids,scripts,commanders,collab,shadow-score/sealed}

Write state.json with Python. Include:

{
  "run_id": "run-YYYYMMDD-HHMMSS",
  "profile": "metaswarm",
  "product": "agent-conductor",
  "runtime_profile": "metaswarm",
  "repo_path": "/abs/repo",
  "mission": "user mission",
  "model_tier": "premium | standard",
  "scale": "small | standard | max",
  "commander_count": 5,
  "collab": {"path": "BASE/collab"},
  "shadow_score": {"sealed": true, "seal_hash": "sha256:..."},
  "phase": "preparing",
  "status": "running"
}

Step 3 - Create Sealed Shadow Score Envelope

Generate hidden acceptance criteria before any commander starts. Do not include criteria contents in commander prompts, manifests, collaboration ledgers, or responses while the run is active.

Create shadow-score/sealed/criteria.json:

{
  "mission": "original mission",
  "generated_at": "ISO-8601",
  "dimensions": [
    "requirement_coverage",
    "collaboration_quality",
    "evidence_quality",
    "test_validation_impact",
    "synthesis_usefulness"
  ],
  "acceptance_checks": [
    "Final synthesis answers the mission directly.",
    "At least two commanders review another commander's proposal.",
    "Consensus items cite source refs from commander bundles or collab ledgers.",
    "Partial runs are explicitly labeled partial with launch counts."
  ]
}

Hash it:

shasum -a 256 "$BASE/shadow-score/sealed/criteria.json" > "$BASE/shadow-score/seal.sha256"

Only expose the hash in state.json and commander manifests.

Step 4 - Collaboration Bus

Create append-only ledgers:

collab/proposals.jsonl
collab/reviews.jsonl
collab/improvements.jsonl
collab/consensus.jsonl
collab/broadcasts.jsonl

Protocol sequence:

propose -> peer_review -> improve -> consensus -> broadcast -> adopt

Create collab/protocol.json with Python json.dump before commanders start:

{
  "version": 1,
  "workflow": ["propose", "peer_review", "improve", "consensus", "broadcast", "adopt"],
  "ledgers": {
    "proposals": "proposals.jsonl",
    "reviews": "reviews.jsonl",
    "improvements": "improvements.jsonl",
    "consensus": "consensus.jsonl",
    "broadcasts": "broadcasts.jsonl"
  },
  "record_required_fields": ["ts", "run_id", "commander_id", "event", "item_id", "summary", "evidence", "confidence", "source_refs"],
  "append_only": true
}

Every record should include:

{
  "ts": "ISO-8601",
  "run_id": "run-...",
  "commander_id": "commander-001",
  "event": "proposal | peer_review | improvement | consensus | broadcast",
  "item_id": "stable-id",
  "summary": "short",
  "evidence": [],
  "confidence": 0.0,
  "source_refs": []
}

adopt is not a separate ledger. It is represented by a commander writing an improvement or consensus record that cites the broadcasts.jsonl item it adopted in source_refs.

Depth Guard

Every commander and worker prompt MUST carry depth_config with can_launch, current_depth, and max_depth.

A commander MUST NOT spawn or launch sub-agents when can_launch is false or current_depth >= max_depth.

Before spawning sub-agents, increment current_depth by 1 and set can_launch for the next depth from the same rule.

Step 5 - Commander Manifests

Write exactly five queue manifests, one per commander. Use different domains so commander groups overlap enough to collaborate but still bring distinct views.

Recommended domain rotation:

architecture, implementation, telemetry-dashboard, model-depth-recovery, synthesis-shadow-score

Each manifest must include:

{
  "task_id": "commander-001",
  "commander_id": "commander-001",
  "run_id": "run-...",
  "kind": "commander",
  "role": "commander",
  "profile": "metaswarm",
  "product": "agent-conductor",
  "runtime_profile": "metaswarm",
  "swarm_scale": "ss-250",
  "per_commander_full_swarm": true,
  "objective": "original mission",
  "mission": "original mission",
  "domain": "architecture",
  "repo_path": "/abs/repo",
  "branch": "havoc-swarm/run-.../commander-001",
  "model_tier": "premium",
  "model_policy": "premium",
  "premium_model_pool": ["claude-opus-4.7", "gpt-5.5", "claude-opus-4.6", "gpt-5.4"],
  "banned_child_models": ["claude-haiku-4.5", "gpt-5.4-mini", "gpt-5-mini", "gpt-4.1"],
  "shadow_score": {"sealed": true, "seal_hash": "sha256:..."},
  "collab": {"path": "BASE/collab", "protocol": "BASE/collab/protocol.json"},
  "constraints": {
    "max_workers": 250,
    "squad_leads_per_commander": 50,
    "workers_per_squad_lead": 5,
    "workers_per_commander": 250,
    "required_status_values": ["success", "partial", "failed"]
  },
  "depth_budget": {"squads_allocated": 50, "squads_max": 50},
  "depth_config": {"can_launch": true, "current_depth": 0, "max_depth": 2},
  "launch_proof": {
    "required": true,
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl"
  },
  "telemetry_contract": {
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "launch_proof_required": true
  }
}

The launcher and skill must treat task_id != commander_id as a hard binding failure. Do not let a commander claim another commander's manifest, and do not repair a bad claim by relabeling the commander after launch.

Runtime note: Stampede may use profile: metaswarm internally so existing monitor and Agent Pulse parsers work. product: agent-conductor preserves the user-facing product identity.

`swarm-state.json` schema

Each commander must keep this state fresh with Python json.dump:

{
  "commander_id": "commander-001",
  "run_id": "run-...",
  "status": "running | success | partial | failed",
  "phase": "launching_workers | collecting | complete",
  "model_policy": "premium",
  "squads_target": 50,
  "workers_target": 250,
  "squad_leads_launched": 50,
  "squad_leads_running": 0,
  "squad_leads_completed": 50,
  "squad_leads_failed": 0,
  "workers_launched": 250,
  "workers_running": 0,
  "workers_completed": 250,
  "workers_failed": 0,
  "launch_proof": {
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl"
  },
  "updated_at": "ISO-8601"
}

Partial runs are explicitly labeled with launch counts. Never call a partial run a success.

Commander Prompt Template

Commander prompts must tell the commander to read manifest.json, enforce depth_config.can_launch, launch only within constraints.max_workers, write launch proof to swarm-state.json, append compatibility telemetry to child-agents.jsonl, and write JSON only with Python json.dump. The commander must publish at least one collaboration record when it has useful evidence and must write bundle.json with status success, partial, or failed.

Worker Prompt Template

Worker prompts must receive context-capsule.json, assignments.json, and depth_config with can_launch false unless a future explicit depth budget allows deeper work. Workers write atomic findings under atoms/, cite evidence, and use Python json.dump for every JSON file.

Step 6 - Launch Terminal Stampede

Invoke the installed Stampede launcher. --metaswarm gives each commander its own visible pane and nested sub-agent proof contract.

COMMANDER_COUNT=5
if [[ "$COMMANDER_COUNT" -ne 5 ]]; then
  echo "Agent Conductor metaswarm requires exactly 5 commanders" >&2
  exit 1
fi

STAMPEDE_OBJECTIVE="$MISSION" \
STAMPEDE_METASWARM_NO_GATES=1 \
STAMPEDE_MODEL_POLICY="$MODEL_TIER" \
STAMPEDE_PREMIUM_MODEL_POOL="$PREMIUM_MODEL_POOL" \
STAMPEDE_BANNED_CHILD_MODELS="claude-haiku-4.5,gpt-5.4-mini,gpt-5-mini,gpt-4.1" \
STAMPEDE_SQUAD_LEADS_PER_COMMANDER=50 \
STAMPEDE_WORKERS_PER_SQUAD_LEAD=5 \
STAMPEDE_WORKERS_PER_COMMANDER=250 \
~/bin/stampede.sh \
  --metaswarm \
  --run-id "$RUN_ID" \
  --count 5 \
  --repo "$REPO_PATH" \
  --models "$MODEL_POOL" \
  --no-attach

If Agent Pulse is requested, launch or refresh it with the repo scan root:

osascript -e 'tell application "Terminal" to do script "cd ~/copilot-cli-agent-pulse && AGENT_PULSE_SCAN_ROOTS='"$REPO_PATH"' python3 agent_pulse.py --no-splash"' \
  -e 'tell application "Terminal" to activate'

Agent Pulse project link: https://github.com/DUBSOpenHub/copilot-cli-agent-pulse. Use it when the user wants real-time visibility across Copilot sessions, Terminal Stampede runs, commander status, sub-agent counts, and telemetry confidence.

Step 7 - Status and Live Insights

For status, read:

orchestrator-commentary.json for concise stats and insights.
fleet.json for commander models.
commanders/*/swarm-state.json for launch proof.
commanders/*/child-agents.jsonl for compatibility sub-agent telemetry.
collab/*.jsonl for collaboration counts.
results/commander-*.json for terminal bundles.

For Agent Conductor, expected terminal bundle count is always exactly 5. If fewer than five result bundles exist, report the run as running, partial, or failed; do not begin final synthesis.

Do not provide heavy narration. Report compactly:

RUN run-... | cmd 3/5 active | sub-agents 112 running / 480 done / 620 seen | results 2/5
collab p5 r18 i11 c8 b7
active: commander-004 launching_workers, commander-005 collecting

Swarm-Style Chat Commentary

In chat, use a visible swarm-style format: phase banner, progress table, clear status word, and no hidden process narration. Do not make the user infer health from dashboard numbers alone.

Use this shape for launch/status/stop updates:

🐝 AGENT CONDUCTOR — EXECUTION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

RUN run-...  STATUS running|partial|failed|stopped

Commander       Model              Phase                Sub-agents        State
commander-001   claude-opus-4.7    launching_workers    120/250          active
commander-002   gpt-5.5            complete             250/250          success
commander-003   claude-opus-4.6    failed_startup       0/250            failed

Totals: cmd 2/5 active · sub-agents 42 running / 370 done / 620 seen · results 3/5
Decision: wait | stop | synthesize partial | relaunch

Rules:

Lead with the health state, not implementation details.
Show every commander in a small table when the user is debugging a run.
Distinguish requested, started/seen, running, done, and failed.
If commanders fail at startup, say failed_startup and recommend stop/relaunch rather than saying the run is simply "still running".
When stopping, report teardown outcome: processes, tmux session, queue/claims, result bundle count, and final run state.
Keep the final synthesis separate from live commentary.

Step 8 - Recovery

If a commander dies before a terminal bundle:

Check Agent Pulse commander_alerts or PID/tmux state.
Expect hardened Stampede to synthesize that commander's failed or partial bundle.json and results/commander-###.json.
If no terminal result exists and the commander heartbeat is stale with no tmux process, requeue the exact same commander manifest unless generation is exhausted.
Mark unrecoverable commanders partial or failed; never call a partial Agent Conductor run a full success.
Keep existing collab ledgers and Shadow Score envelope immutable.

For teardown:

~/bin/stampede.sh --teardown --run-id "$RUN_ID" --repo "$REPO_PATH"

Step 9 - Final Synthesis and Shadow Score

After results/commander-*.json count equals exactly 5:

Load all commander bundles.
Verify every result has commander_id and task_id matching its filename: commander-001 through commander-005.
Verify each commander has launch proof in swarm-state.json and child-agents.jsonl; if a commander has zero worker launch proof, keep that commander partial or failed and carry the caveat into final synthesis.
Load collab/proposals.jsonl, reviews.jsonl, improvements.jsonl, consensus.jsonl, and broadcasts.jsonl.
Recompute the hash for shadow-score/sealed/criteria.json and compare it to shadow-score/seal.sha256. If it differs, stop and mark the scorecard as failed due to seal mismatch.
Open the sealed Shadow Score criteria only in the orchestrator context.
Score each commander and the final synthesis on:
- requirement coverage
- collaboration quality
- evidence quality
- test/validation impact
- synthesis usefulness
Write shadow-score/scorecard.json.
Produce a final answer with:
- commander status table
- concise collaboration stats
- Shadow Score summary
- best findings and consensus
- partial/failure caveats
- next action

scorecard.json schema:

{
  "run_id": "run-...",
  "seal_verified": true,
  "criteria_hash": "sha256:...",
  "overall_score": 0.0,
  "status": "success | partial | failed",
  "dimensions": {
    "requirement_coverage": 0.0,
    "collaboration_quality": 0.0,
    "evidence_quality": 0.0,
    "test_validation_impact": 0.0,
    "synthesis_usefulness": 0.0
  },
  "commander_scores": [
    {"commander_id": "commander-001", "status": "success", "score": 0.0, "evidence_refs": []}
  ],
  "partial_caveats": []
}

Completion Criteria

Run directory exists with state, collab bus, sealed Shadow Score hash, queue, commander telemetry, and concise stats feed.
Stampede launched with --metaswarm --count 5.
Agent Pulse or Stampede monitor can show concise stats and insights.
Exactly five commander manifests exist and each binds task_id to the same commander_id.
Exactly five terminal result bundles exist under results/commander-*.json.
Every commander has success, partial, or failed; missing worker launch proof prevents success.
Final synthesis includes Shadow Score results without exposing sealed criteria.

name	agent-conductor
description	Multi-agent fleet conductor with real-time TUI observability. Launches multiple Terminal Stampede commander groups, keeps them coordinated through live collaboration ledgers, and applies sealed Shadow Score evaluation. Say "agent conductor" to start.
tools	["bash","glob","view","sql","ask_user","task","read_agent","list_agents"]

Agent Conductor

Use user-facing terminology sub-agents. The compatibility ledger filename child-agents.jsonl may appear in paths, but responses should use sub-agents.

Command Grammar

Pattern	Action
`agent conductor`	Ask only for missing mission/repo; default to Premium Max + Agent Pulse + Stampede monitor
`agent conductor on REPO : MISSION`	Launch with Premium Max + Agent Pulse + Stampede monitor unless tier/scale are explicit
`agent conductor premium max on REPO : MISSION`	Launch 5 premium commander groups with Agent Pulse + Stampede monitor
`agent conductor standard small on REPO : MISSION`	Launch 5 standard-tier commander groups with Agent Pulse + Stampede monitor
`agent conductor status [RUN_ID]`	Show concise run stats and insights
`agent conductor teardown RUN_ID`	Stop the underlying Stampede tmux session

Default repo is the current working directory. Mission text after : is required for a launch.

Launch Defaults and Questions

Default launch settings:

model_tier: Premium - frontier/premium models for commanders and sub-agents.
scale: Max - exactly 5 commander groups.
dashboard: Agent Pulse + Stampede monitor.

Do not ask tier, scale, or dashboard questions when those fields are absent. Only ask one question at a time for truly missing launch inputs:

Ask for mission when no text after : was provided.
Ask for repo only when no on REPO was provided and the current working directory is not the intended target or is not a usable repository.

Commander Cardinality Contract

Agent Conductor has one non-negotiable launch shape: exactly five commander groups, commander-001 through commander-005.

Do not launch an Agent Conductor metaswarm with fewer or more than five commanders.
Do not translate small, standard, local capacity, or transient model errors into a lower commander count.
Each commander manifest must bind task_id and commander_id to the same exact value, e.g. commander-003.
The installed Stampede launcher must be the hardened runtime that rejects --metaswarm --count values other than 5, pre-claims exact commander manifests, and writes a terminal bundle.json plus results/commander-###.json for every commander exit path.
If five commanders cannot be launched, mark the run failed or partial and tell the user why. Never silently downscale.

Model Tiers

Premium model rotation:

claude-opus-4.7,gpt-5.5,claude-opus-4.6,gpt-5.4,claude-opus-4.5,gpt-5.2,claude-sonnet-4.6,gpt-5.3-codex,claude-sonnet-4.5,gpt-5.2-codex

Standard model rotation:

claude-sonnet-4.6,gpt-5.4,claude-sonnet-4.5,gpt-5.3-codex,gpt-5.2-codex,gpt-5.2

Never silently use these models for Agent Conductor sub-agents:

claude-haiku-4.5,gpt-5.4-mini,gpt-5-mini,gpt-4.1

This is a no mini/cheap silent downgrade rule: partial capacity must stay partial rather than falling back to cheap or mini models without explicit user direction.

Pass the chosen tier through the run state, commander manifests, and Stampede environment. Stampede exposes the active policy as:

STAMPEDE_MODEL_POLICY
STAMPEDE_PREMIUM_MODEL_POOL
STAMPEDE_BANNED_CHILD_MODELS
STAMPEDE_SQUAD_LEADS_PER_COMMANDER
STAMPEDE_WORKERS_PER_SQUAD_LEAD
STAMPEDE_WORKERS_PER_COMMANDER

Prerequisites and Preflight

Before launch, verify the local runtime exists:

command -v copilot
command -v tmux
command -v python3
command -v git
test -x ~/bin/stampede.sh
test -x ~/bin/stampede-monitor.sh
grep -q "stampede_synthesize_commander_bundle" ~/bin/stampede.sh
grep -q "Your exact commander task is pre-claimed" ~/bin/stampede.sh
grep -q "requires exactly 5 commanders" ~/bin/stampede.sh

Run Layout

Agent Conductor reuses Stampede's repo-local runtime so Agent Pulse and tmux monitors work without another service:

REPO/.stampede/RUN_ID/
  state.json
  fleet.json
  queue/
  claimed/
  results/
  commanders/commander-###/
    manifest.json
    context-capsule.json
    assignments.json
    swarm-state.json
    child-agents.jsonl
    bundle.json
    atoms/
    logs/
  collab/
    protocol.json
    proposals.jsonl
    reviews.jsonl
    improvements.jsonl
    consensus.jsonl
    broadcasts.jsonl
  shadow-score/
    sealed/
      criteria.json
    seal.sha256
    scorecard.json
  orchestrator-commentary.json
  orchestrator-commentary.jsonl

orchestrator-commentary.json must stay concise: stats and insights only. Good shape:

cmd 3/5 active · sub-agents 112 running / 480 done / 620 seen · q 0 · claimed 3 · results 2/5
collab p5 r18 i11 c8 b7
commander-004 launching_workers · squads 32/50 · sub-agents 160/250 · run 42 done 118 fail 0

Structured schema:

{
  "ts": "ISO-8601",
  "run_id": "run-...",
  "profile": "metaswarm",
  "queue": 0,
  "claimed": 0,
  "results": 5,
  "total_tasks": 5,
  "lines": [
    "cmd 0/5 active · sub-agents 0 running / 1250 done / 1250 seen · q 0 · claimed 0 · results 5/5",
    "collab p5 r18 i11 c8 b7"
  ],
  "agents": [
    {
      "id": "commander-001",
      "status": "success",
      "phase": "complete",
      "active": false,
      "model": "claude-opus-4.7",
      "heartbeat_age_s": 4,
      "squad_leads": 50,
      "squad_target": 50,
      "workers": 250,
      "worker_target": 250,
      "sub_agents_seen": 300,
      "sub_agents_running": 0,
      "sub_agents_done": 300,
      "sub_agents_failed": 0
    }
  ]
}

Step 0 - SQL Tracking

Create tables if needed:

CREATE TABLE IF NOT EXISTS agent_conductor_runs (
    run_id TEXT PRIMARY KEY,
    repo_path TEXT NOT NULL,
    mission TEXT NOT NULL,
    model_tier TEXT NOT NULL,
    scale TEXT NOT NULL,
    commander_count INTEGER NOT NULL,
    status TEXT DEFAULT 'running',
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE IF NOT EXISTS agent_conductor_events (
    run_id TEXT NOT NULL,
    ts TEXT DEFAULT (datetime('now')),
    event TEXT NOT NULL,
    detail TEXT
);

Step 1 - Parse and Normalize

Extract:

Field	Source	Default
`repo_path`	`on REPO`	current working directory
`mission`	text after `:`	ask if missing
`model_tier`	premium/standard	premium
`scale`	small/standard/max	max
`dashboard`	requested dashboard	Agent Pulse + Stampede monitor

Map scale to commander count. These labels are preserved for user intent and future policy, but they all keep the same five-commander metaswarm cardinality:

Scale	Commanders
Small	5
Standard	5
Max	5

Step 2 - Create Run Directory

Use run-YYYYMMDD-HHMMSS.

RUN_ID="$(python3 - <<'PY'
import datetime
print("run-" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
PY
)"
BASE="$REPO_PATH/.stampede/$RUN_ID"
mkdir -p "$BASE"/{queue,claimed,results,logs,pids,scripts,commanders,collab,shadow-score/sealed}

Write state.json with Python. Include:

{
  "run_id": "run-YYYYMMDD-HHMMSS",
  "profile": "metaswarm",
  "product": "agent-conductor",
  "runtime_profile": "metaswarm",
  "repo_path": "/abs/repo",
  "mission": "user mission",
  "model_tier": "premium | standard",
  "scale": "small | standard | max",
  "commander_count": 5,
  "collab": {"path": "BASE/collab"},
  "shadow_score": {"sealed": true, "seal_hash": "sha256:..."},
  "phase": "preparing",
  "status": "running"
}

Step 3 - Create Sealed Shadow Score Envelope

Generate hidden acceptance criteria before any commander starts. Do not include criteria contents in commander prompts, manifests, collaboration ledgers, or responses while the run is active.

Create shadow-score/sealed/criteria.json:

{
  "mission": "original mission",
  "generated_at": "ISO-8601",
  "dimensions": [
    "requirement_coverage",
    "collaboration_quality",
    "evidence_quality",
    "test_validation_impact",
    "synthesis_usefulness"
  ],
  "acceptance_checks": [
    "Final synthesis answers the mission directly.",
    "At least two commanders review another commander's proposal.",
    "Consensus items cite source refs from commander bundles or collab ledgers.",
    "Partial runs are explicitly labeled partial with launch counts."
  ]
}

Hash it:

shasum -a 256 "$BASE/shadow-score/sealed/criteria.json" > "$BASE/shadow-score/seal.sha256"

Only expose the hash in state.json and commander manifests.

Step 4 - Collaboration Bus

Create append-only ledgers:

collab/proposals.jsonl
collab/reviews.jsonl
collab/improvements.jsonl
collab/consensus.jsonl
collab/broadcasts.jsonl

Protocol sequence:

propose -> peer_review -> improve -> consensus -> broadcast -> adopt

Create collab/protocol.json with Python json.dump before commanders start:

{
  "version": 1,
  "workflow": ["propose", "peer_review", "improve", "consensus", "broadcast", "adopt"],
  "ledgers": {
    "proposals": "proposals.jsonl",
    "reviews": "reviews.jsonl",
    "improvements": "improvements.jsonl",
    "consensus": "consensus.jsonl",
    "broadcasts": "broadcasts.jsonl"
  },
  "record_required_fields": ["ts", "run_id", "commander_id", "event", "item_id", "summary", "evidence", "confidence", "source_refs"],
  "append_only": true
}

Every record should include:

{
  "ts": "ISO-8601",
  "run_id": "run-...",
  "commander_id": "commander-001",
  "event": "proposal | peer_review | improvement | consensus | broadcast",
  "item_id": "stable-id",
  "summary": "short",
  "evidence": [],
  "confidence": 0.0,
  "source_refs": []
}

adopt is not a separate ledger. It is represented by a commander writing an improvement or consensus record that cites the broadcasts.jsonl item it adopted in source_refs.

Depth Guard

Every commander and worker prompt MUST carry depth_config with can_launch, current_depth, and max_depth.

A commander MUST NOT spawn or launch sub-agents when can_launch is false or current_depth >= max_depth.

Before spawning sub-agents, increment current_depth by 1 and set can_launch for the next depth from the same rule.

Step 5 - Commander Manifests

Write exactly five queue manifests, one per commander. Use different domains so commander groups overlap enough to collaborate but still bring distinct views.

Recommended domain rotation:

architecture, implementation, telemetry-dashboard, model-depth-recovery, synthesis-shadow-score

Each manifest must include:

{
  "task_id": "commander-001",
  "commander_id": "commander-001",
  "run_id": "run-...",
  "kind": "commander",
  "role": "commander",
  "profile": "metaswarm",
  "product": "agent-conductor",
  "runtime_profile": "metaswarm",
  "swarm_scale": "ss-250",
  "per_commander_full_swarm": true,
  "objective": "original mission",
  "mission": "original mission",
  "domain": "architecture",
  "repo_path": "/abs/repo",
  "branch": "havoc-swarm/run-.../commander-001",
  "model_tier": "premium",
  "model_policy": "premium",
  "premium_model_pool": ["claude-opus-4.7", "gpt-5.5", "claude-opus-4.6", "gpt-5.4"],
  "banned_child_models": ["claude-haiku-4.5", "gpt-5.4-mini", "gpt-5-mini", "gpt-4.1"],
  "shadow_score": {"sealed": true, "seal_hash": "sha256:..."},
  "collab": {"path": "BASE/collab", "protocol": "BASE/collab/protocol.json"},
  "constraints": {
    "max_workers": 250,
    "squad_leads_per_commander": 50,
    "workers_per_squad_lead": 5,
    "workers_per_commander": 250,
    "required_status_values": ["success", "partial", "failed"]
  },
  "depth_budget": {"squads_allocated": 50, "squads_max": 50},
  "depth_config": {"can_launch": true, "current_depth": 0, "max_depth": 2},
  "launch_proof": {
    "required": true,
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl"
  },
  "telemetry_contract": {
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "launch_proof_required": true
  }
}

Runtime note: Stampede may use profile: metaswarm internally so existing monitor and Agent Pulse parsers work. product: agent-conductor preserves the user-facing product identity.

`swarm-state.json` schema

Each commander must keep this state fresh with Python json.dump:

{
  "commander_id": "commander-001",
  "run_id": "run-...",
  "status": "running | success | partial | failed",
  "phase": "launching_workers | collecting | complete",
  "model_policy": "premium",
  "squads_target": 50,
  "workers_target": 250,
  "squad_leads_launched": 50,
  "squad_leads_running": 0,
  "squad_leads_completed": 50,
  "squad_leads_failed": 0,
  "workers_launched": 250,
  "workers_running": 0,
  "workers_completed": 250,
  "workers_failed": 0,
  "launch_proof": {
    "state_file": "BASE/commanders/commander-001/swarm-state.json",
    "child_ledger": "BASE/commanders/commander-001/child-agents.jsonl",
    "sub_agent_ledger": "BASE/commanders/commander-001/child-agents.jsonl"
  },
  "updated_at": "ISO-8601"
}

Partial runs are explicitly labeled with launch counts. Never call a partial run a success.

Commander Prompt Template

Worker Prompt Template

Step 6 - Launch Terminal Stampede

Invoke the installed Stampede launcher. --metaswarm gives each commander its own visible pane and nested sub-agent proof contract.

COMMANDER_COUNT=5
if [[ "$COMMANDER_COUNT" -ne 5 ]]; then
  echo "Agent Conductor metaswarm requires exactly 5 commanders" >&2
  exit 1
fi

STAMPEDE_OBJECTIVE="$MISSION" \
STAMPEDE_METASWARM_NO_GATES=1 \
STAMPEDE_MODEL_POLICY="$MODEL_TIER" \
STAMPEDE_PREMIUM_MODEL_POOL="$PREMIUM_MODEL_POOL" \
STAMPEDE_BANNED_CHILD_MODELS="claude-haiku-4.5,gpt-5.4-mini,gpt-5-mini,gpt-4.1" \
STAMPEDE_SQUAD_LEADS_PER_COMMANDER=50 \
STAMPEDE_WORKERS_PER_SQUAD_LEAD=5 \
STAMPEDE_WORKERS_PER_COMMANDER=250 \
~/bin/stampede.sh \
  --metaswarm \
  --run-id "$RUN_ID" \
  --count 5 \
  --repo "$REPO_PATH" \
  --models "$MODEL_POOL" \
  --no-attach

If Agent Pulse is requested, launch or refresh it with the repo scan root:

osascript -e 'tell application "Terminal" to do script "cd ~/copilot-cli-agent-pulse && AGENT_PULSE_SCAN_ROOTS='"$REPO_PATH"' python3 agent_pulse.py --no-splash"' \
  -e 'tell application "Terminal" to activate'

Step 7 - Status and Live Insights

For status, read:

orchestrator-commentary.json for concise stats and insights.
fleet.json for commander models.
commanders/*/swarm-state.json for launch proof.
commanders/*/child-agents.jsonl for compatibility sub-agent telemetry.
collab/*.jsonl for collaboration counts.
results/commander-*.json for terminal bundles.

For Agent Conductor, expected terminal bundle count is always exactly 5. If fewer than five result bundles exist, report the run as running, partial, or failed; do not begin final synthesis.

Do not provide heavy narration. Report compactly:

RUN run-... | cmd 3/5 active | sub-agents 112 running / 480 done / 620 seen | results 2/5
collab p5 r18 i11 c8 b7
active: commander-004 launching_workers, commander-005 collecting

Swarm-Style Chat Commentary

In chat, use a visible swarm-style format: phase banner, progress table, clear status word, and no hidden process narration. Do not make the user infer health from dashboard numbers alone.

Use this shape for launch/status/stop updates:

🐝 AGENT CONDUCTOR — EXECUTION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

RUN run-...  STATUS running|partial|failed|stopped

Commander       Model              Phase                Sub-agents        State
commander-001   claude-opus-4.7    launching_workers    120/250          active
commander-002   gpt-5.5            complete             250/250          success
commander-003   claude-opus-4.6    failed_startup       0/250            failed

Totals: cmd 2/5 active · sub-agents 42 running / 370 done / 620 seen · results 3/5
Decision: wait | stop | synthesize partial | relaunch

Rules:

Lead with the health state, not implementation details.
Show every commander in a small table when the user is debugging a run.
Distinguish requested, started/seen, running, done, and failed.
If commanders fail at startup, say failed_startup and recommend stop/relaunch rather than saying the run is simply "still running".
When stopping, report teardown outcome: processes, tmux session, queue/claims, result bundle count, and final run state.
Keep the final synthesis separate from live commentary.

Step 8 - Recovery

If a commander dies before a terminal bundle:

Check Agent Pulse commander_alerts or PID/tmux state.
Expect hardened Stampede to synthesize that commander's failed or partial bundle.json and results/commander-###.json.
If no terminal result exists and the commander heartbeat is stale with no tmux process, requeue the exact same commander manifest unless generation is exhausted.
Mark unrecoverable commanders partial or failed; never call a partial Agent Conductor run a full success.
Keep existing collab ledgers and Shadow Score envelope immutable.

For teardown:

~/bin/stampede.sh --teardown --run-id "$RUN_ID" --repo "$REPO_PATH"

Step 9 - Final Synthesis and Shadow Score

After results/commander-*.json count equals exactly 5:

Load all commander bundles.
Verify every result has commander_id and task_id matching its filename: commander-001 through commander-005.
Verify each commander has launch proof in swarm-state.json and child-agents.jsonl; if a commander has zero worker launch proof, keep that commander partial or failed and carry the caveat into final synthesis.
Load collab/proposals.jsonl, reviews.jsonl, improvements.jsonl, consensus.jsonl, and broadcasts.jsonl.
Recompute the hash for shadow-score/sealed/criteria.json and compare it to shadow-score/seal.sha256. If it differs, stop and mark the scorecard as failed due to seal mismatch.
Open the sealed Shadow Score criteria only in the orchestrator context.
Score each commander and the final synthesis on:
- requirement coverage
- collaboration quality
- evidence quality
- test/validation impact
- synthesis usefulness
Write shadow-score/scorecard.json.
Produce a final answer with:
- commander status table
- concise collaboration stats
- Shadow Score summary
- best findings and consensus
- partial/failure caveats
- next action

scorecard.json schema:

{
  "run_id": "run-...",
  "seal_verified": true,
  "criteria_hash": "sha256:...",
  "overall_score": 0.0,
  "status": "success | partial | failed",
  "dimensions": {
    "requirement_coverage": 0.0,
    "collaboration_quality": 0.0,
    "evidence_quality": 0.0,
    "test_validation_impact": 0.0,
    "synthesis_usefulness": 0.0
  },
  "commander_scores": [
    {"commander_id": "commander-001", "status": "success", "score": 0.0, "evidence_refs": []}
  ],
  "partial_caveats": []
}

Completion Criteria

Run directory exists with state, collab bus, sealed Shadow Score hash, queue, commander telemetry, and concise stats feed.
Stampede launched with --metaswarm --count 5.
Agent Pulse or Stampede monitor can show concise stats and insights.
Exactly five commander manifests exist and each binds task_id to the same commander_id.
Exactly five terminal result bundles exist under results/commander-*.json.
Every commander has success, partial, or failed; missing worker launch proof prevents success.
Final synthesis includes Shadow Score results without exposing sealed criteria.

agent-conductor

Agent Conductor

Command Grammar

Launch Defaults and Questions

Commander Cardinality Contract

Model Tiers

Prerequisites and Preflight

Run Layout

Step 0 - SQL Tracking

Step 1 - Parse and Normalize

Step 2 - Create Run Directory

Step 3 - Create Sealed Shadow Score Envelope

Step 4 - Collaboration Bus

Depth Guard

Step 5 - Commander Manifests

swarm-state.json schema

Commander Prompt Template

Worker Prompt Template

Step 6 - Launch Terminal Stampede

Step 7 - Status and Live Insights

Swarm-Style Chat Commentary

Step 8 - Recovery

Step 9 - Final Synthesis and Shadow Score

Completion Criteria

Agent Conductor

Command Grammar

Launch Defaults and Questions

Commander Cardinality Contract

Model Tiers

Prerequisites and Preflight

Run Layout

Step 0 - SQL Tracking

Step 1 - Parse and Normalize

Step 2 - Create Run Directory

Step 3 - Create Sealed Shadow Score Envelope

Step 4 - Collaboration Bus

Depth Guard

Step 5 - Commander Manifests

swarm-state.json schema

Commander Prompt Template

Worker Prompt Template

Step 6 - Launch Terminal Stampede

Step 7 - Status and Live Insights

Swarm-Style Chat Commentary

Step 8 - Recovery

Step 9 - Final Synthesis and Shadow Score

Completion Criteria

`swarm-state.json` schema

`swarm-state.json` schema