Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

fleet-dispatch-and-watch

Use when Mark wants large or mechanical work fanned out across his machine fleet, or wants long-running/background/remote agents monitored. Triggers on 'distribute among the fleet', 'delegate with tmux and worktrees', 'kick it off fan them out', 'offload to mac-mini, desktop just orchestrate', 'check on the agents every 5 min', 'is it done its been some hours', 'are we ready to delegate', or any dispatch to background workers. This is Mark's dispatch→snapshot→poll→escalate→checkpoint loop — fan independent work to workers (local box orchestrates), hand each a precise state snapshot, then poll on a fixed cadence with trustworthy liveness proxies, error-signature buckets, and hard stop/escalation thresholds — never blocking or assuming success.

Exécuter dans Manus

Aperçu

Commande d'installation

npx skills add https://github.com/artsmc/claude-dev-agents --skill fleet-dispatch-and-watch

Copiez et collez cette commande dans Claude Code pour installer le skill

Source

artsmc/claude-dev-agents

Étoiles3

Forks0

Mis à jour31 mai 2026 à 03:44

Explorateur de fichiers

2 fichiers

SKILL.md

readonly

name

fleet-dispatch-and-watch

description

Fleet Dispatch and Watch

For Mark, 'fleet' means splitting work across machines (Mac Mini, Desktop WSL, local). For any sizable or mechanical body of work he reaches for parallel execution and treats serial as a mistake. The monitoring discipline is precise and easy to get wrong — that's the whole point of this skill.

1. Sync to a known-good baseline before dispatching

pull latest from main; are all of my claude root folders in sync with agents and skills?
Confirm fleet repos are in sync and the target resource exists in the canonical registry: do we have applications/sentient-home in our fleet repo?
Commit accumulated work to a clean checkpoint first: lets get all this code commited before moving on.

2. Fan out — local orchestrates, workers do the heavy compute

distribute the planning and effort among the fleet
each process should consider delegating across machines using tmux and worktrees
i think i need to offload entirely to mac-mini and desktop, just be a orchestrator for the sake of the machine battery
kick it off, fan them out

Mechanics Mark expects: one tmux session + git worktree per unit of work (isolation, no file conflicts), heavy jobs on Mac Mini / Desktop, local machine as orchestrator only. Respect hardware limits — a single machine can't handle it all at once, which is the reason for the fan-out.

3. Hand each worker a PRECISE state snapshot

Don't just kick a vague task — supply done/committed-but-unpushed/blocking-cause/hypothesis, and clean up partial state as part of recovery:

sen-467 work is committed on desktop WSL (commit fcb5bed) but NOT pushed — blocked because desktop sshd was closing all connections (tailnet ping worked, likely memory pressure from a 29h hung jest I tried to kill)
and clean up partial sentient-monorepo node_modules Give each worker exact IDs/targets (run IDs, branch names, ticket IDs, SSH host), never a vague 'continue'. The same snapshot discipline applies to subagents/teams, not just machines — see scope-question-and-delegate for budgeting exactly the context a delegate needs (no more, no less).

4. Poll on a fixed cadence — never block, never assume success

Set recurring check-ins and spell out the exact verification commands with branching outcomes:

check on the agents every 5 min; add checkins every 5min
Check it now: ssh ... 'tmux ls; tail -40 ~/dispatch-logs/sen-467.log'. If done, report final summary + all 4 branches pushed. If still running, schedule another 1200s wakeup. If failed/stuck >2h total, alert. Probe whether work actually started and how completion is even detected: is there a auto checker — how do you know when its ready?

5. Judge liveness by TRUSTWORTHY proxies — NOT log size

This is Mark's hard-won, counter-intuitive rule. Get it right:

judge liveness by tmux ls + pgrep -af bypassPermissions, NOT log size (0 bytes = working; a tiny log ending EXIT:1 = failed)
Per cycle: (a) tmux has-session -t s1a — is the agent still running? (b) git log --oneline stage1/shared-infra ^main — is it producing commits?
If liveness is very low (<3 files in 5min), do an extra probe: tmux capture-pane to see if it's stuck on a permission prompt or error. Never conclude 'dead' from an empty log alone.

6. Pre-classify error signatures into wait-vs-retry buckets

React correctly to each known signature instead of treating all errors alike:

'401 Invalid authentication credentials' → user must re-auth, report + wait
'400 role system is not supported' → transient, re-spawn Maintain this bucket list and consult it when a worker errors.

7. Escalate ONLY past explicit thresholds — avoid false alarms

Escalate only if: stuck >15 min with no commits AND CPU <2% (truly idle), repeated F-agent failures, or irrecoverable lockfile
If a connection keeps closing: alert the user that manual WSL sshd recovery is needed (VNC → restart ssh) and stop polling. Don't panic on a single slow cycle; don't loop forever — end the loop on the defined done-state (end the loop once Wave A is complete).

8. Protect worker state — destructive ops are forbidden

NEVER git reset --hard on workers.
Push main after every merge. Update CURRENT-STATE.md after every state change.

9. Checkpoint and close the communication loop

When work passes: sweep into commits, push so work is recoverable across machines, sync persistent docs by repo name, record release status, and produce a stakeholder summary:

update /memory-bank && /document-hub; do a /memory-bank-update on MVP-backend
update linear with the release status
write up a description summary of what has been deployed in each app so i can tell non technical stakeholders what to look for Require a commit/push checkpoint BEFORE any long autonomous run so it's recoverable.

Red flags

Anti-pattern	Mark's rule
Running large mechanical work serially	`distribute among the fleet... tmux and worktrees`
Concluding an agent is dead from an empty log	`0 bytes = working; rely on tmux/pgrep + git state`
Treating all worker errors the same	Bucket them: 401 → wait/re-auth; 400 role → re-spawn
Escalating on one slow cycle	`escalate only if stuck >15min AND CPU<2%`
`git reset --hard` to recover a worker	`NEVER git reset --hard on workers`
Blocking the orchestrator waiting	Poll on cadence; schedule the next wakeup
Dispatching without a state snapshot	Supply done / unpushed / blocking-cause / hypothesis
Long run with no checkpoint	Commit + push first so it's recoverable across machines

Plus depuis ce dépôt

même dépôt

research-gated-build-plan

artsmc/claude-dev-agents

Use at the START of anything sizable, before writing code — when Mark says 'want to work out the full plan', 'I want to do some deep research', 'lets start the planning', 'what does complete look like?', 'what tickets/tasks remain', 'how close are we / what are our current gaps', 'do we already have a harness for this', or 'is there a system we can use?'. This is Mark's front-loaded decision discipline — inventory what already exists before building, scope the gap against a concrete target, research and persist the approach as an artifact, sequence prerequisites, and bake quality bars + phase gates with human checkpoints directly into the go-ahead. It decides WHETHER and HOW to enter the execution skills (start-phase-plan, feature-new); run it first.

2026-05-313

scope-question-and-delegate

artsmc/claude-dev-agents

Use when work is genuinely ambiguous in a way that changes the plan, or big/multi-part/underspecified enough to balloon context — or when the orchestrator's window is trending toward the ~200k cliff where quality and cost degrade. Fires when Mark says 'this is a big one', 'what do you need from me before you start', 'dont drag the whole context through this', 'should we split this up', or 'keep the main thread lean' — or when one thread is doing both the planning AND the execution. This is Mark's triage-then-delegate discipline — decide fast whether real ambiguity or context cost warrants a stop, ask only the few DECISIVE questions, treat the window as a finite budget, and hand each worker a minimal scoped snapshot so the orchestrator plans and synthesizes without accumulating execution detail. Not a hard gate — small clear tasks just proceed.

2026-05-313

diagnose-from-raw-symptom

artsmc/claude-dev-agents

Use when Mark reports a bug by pasting a raw artifact with little or no prose — a stack trace, an HTTP request/response trace (URL + method + status like 500/403/404/502), a console/React/Prisma error, a ChunkLoadError, a shell error, or a screenshot of a visual defect — or when he says things like 'still failing', 'clicking X does nothing', 'no inline editor on the rendered page', 'do we have local postgres/pgvector?', or 'identify the true error'. This is the front-to-back debugging procedure — extract the trigger and reproduction from terse input, probe whether the underlying plumbing even exists, localize by contrasting known-good vs broken, and drive to a durable root cause rather than a workaround. Reach for this BEFORE proposing any fix.

2026-05-303

enumerated-menu-pick-and-sweep

artsmc/claude-dev-agents

Use whenever you're about to present Mark with consequential choices, questions, or tradeoffs — structure them as a numbered or lettered menu so a bare reply counts as a full selection, and use this to read his terse picks. Triggers on one-token replies that only resolve against a labeled list you just gave ('go with B', 'lets go with 2', 'run option B', 'c one big commit'), picks carrying a rider ('option 1 and update envs when your done', '1. yes, and execute it'), multi-item sweeps answering every numbered point at once ('1. yes 2. thats fine 3. just aws 4. whats the mcp server? 5. minimal'), ranged picks with a because-clause scope-cut ('do 1-3, 4 wont be needed because...'), and Mark authoring his own numbered decision list to apply across a spec. Format choices as a menu — a bare letter is uninterpretable unless you offered the list first.

2026-05-303

prove-it-live-before-done

artsmc/claude-dev-agents

Use whenever an agent (or you) is about to claim work is done/fixed/shipped/passing, or when Mark says 'ready?', 'done?', 'did it work?', 'still failing', 'is the live URL responding?', 'check the deploy status', or wants to confirm a fix before pushing. Treat every completion claim, green CI, and passing test as UNPROVEN until the real artifact is exercised end-to-end — drive the actual running app at its real URL/UI/API, confirm the deployed revision is actually live, verify the mutating side-effect (email/queue/DB row) truly fired, gate every forward step on an explicit green signal, audit coverage across the whole surface, and reconcile against the system of record. Then name the specific residual defect with expected-vs-actual.

2026-05-303

reference-as-executable-spec

artsmc/claude-dev-agents

Use when Mark frames a feature by pointing at a concrete reference instead of describing behavior from scratch — when he names a live product as the target ('an inline editor like editor.js', 'a code editor like monaco-editor dark theme', 'much like openclaw.ai'), points at an internal subsystem as the literal spec to replicate ('the OTP from sentient-monorepo is the pattern we should match', 'use sentient-monorepo as the api source of truth'), or anchors a requirement on an external library as the structural model ('setups similar to CASL', 'building this same CASL boilerplate'). Whenever he says 'build it like THAT', 'same as X', or 'mimic/replicate X', treat the named reference as the executable spec — go observe it, extract its real behavior, and let THAT define correct rather than inventing requirements.

2026-05-303

Source

artsmc

artsmc/claude-dev-agents

Ouvrir le dépôt GitHub Voir les dépôts du créateur

Commande d'installation

Téléchargement

Exécuter dans Manus

Utile pourSOC

Développeurs de logicielsProfessions informatiques et mathématiques15-1252L4

name

fleet-dispatch-and-watch

description

Fleet Dispatch and Watch

1. Sync to a known-good baseline before dispatching

pull latest from main; are all of my claude root folders in sync with agents and skills?
Confirm fleet repos are in sync and the target resource exists in the canonical registry: do we have applications/sentient-home in our fleet repo?
Commit accumulated work to a clean checkpoint first: lets get all this code commited before moving on.

2. Fan out — local orchestrates, workers do the heavy compute

distribute the planning and effort among the fleet
each process should consider delegating across machines using tmux and worktrees
i think i need to offload entirely to mac-mini and desktop, just be a orchestrator for the sake of the machine battery
kick it off, fan them out

3. Hand each worker a PRECISE state snapshot

Don't just kick a vague task — supply done/committed-but-unpushed/blocking-cause/hypothesis, and clean up partial state as part of recovery:

sen-467 work is committed on desktop WSL (commit fcb5bed) but NOT pushed — blocked because desktop sshd was closing all connections (tailnet ping worked, likely memory pressure from a 29h hung jest I tried to kill)
and clean up partial sentient-monorepo node_modules Give each worker exact IDs/targets (run IDs, branch names, ticket IDs, SSH host), never a vague 'continue'. The same snapshot discipline applies to subagents/teams, not just machines — see scope-question-and-delegate for budgeting exactly the context a delegate needs (no more, no less).

4. Poll on a fixed cadence — never block, never assume success

Set recurring check-ins and spell out the exact verification commands with branching outcomes:

check on the agents every 5 min; add checkins every 5min
Check it now: ssh ... 'tmux ls; tail -40 ~/dispatch-logs/sen-467.log'. If done, report final summary + all 4 branches pushed. If still running, schedule another 1200s wakeup. If failed/stuck >2h total, alert. Probe whether work actually started and how completion is even detected: is there a auto checker — how do you know when its ready?

5. Judge liveness by TRUSTWORTHY proxies — NOT log size

This is Mark's hard-won, counter-intuitive rule. Get it right:

judge liveness by tmux ls + pgrep -af bypassPermissions, NOT log size (0 bytes = working; a tiny log ending EXIT:1 = failed)
Per cycle: (a) tmux has-session -t s1a — is the agent still running? (b) git log --oneline stage1/shared-infra ^main — is it producing commits?
If liveness is very low (<3 files in 5min), do an extra probe: tmux capture-pane to see if it's stuck on a permission prompt or error. Never conclude 'dead' from an empty log alone.

6. Pre-classify error signatures into wait-vs-retry buckets

React correctly to each known signature instead of treating all errors alike:

'401 Invalid authentication credentials' → user must re-auth, report + wait
'400 role system is not supported' → transient, re-spawn Maintain this bucket list and consult it when a worker errors.

7. Escalate ONLY past explicit thresholds — avoid false alarms

Escalate only if: stuck >15 min with no commits AND CPU <2% (truly idle), repeated F-agent failures, or irrecoverable lockfile
If a connection keeps closing: alert the user that manual WSL sshd recovery is needed (VNC → restart ssh) and stop polling. Don't panic on a single slow cycle; don't loop forever — end the loop on the defined done-state (end the loop once Wave A is complete).

8. Protect worker state — destructive ops are forbidden

NEVER git reset --hard on workers.
Push main after every merge. Update CURRENT-STATE.md after every state change.

9. Checkpoint and close the communication loop

When work passes: sweep into commits, push so work is recoverable across machines, sync persistent docs by repo name, record release status, and produce a stakeholder summary:

update /memory-bank && /document-hub; do a /memory-bank-update on MVP-backend
update linear with the release status
write up a description summary of what has been deployed in each app so i can tell non technical stakeholders what to look for Require a commit/push checkpoint BEFORE any long autonomous run so it's recoverable.

Red flags

Anti-pattern	Mark's rule
Running large mechanical work serially	`distribute among the fleet... tmux and worktrees`
Concluding an agent is dead from an empty log	`0 bytes = working; rely on tmux/pgrep + git state`
Treating all worker errors the same	Bucket them: 401 → wait/re-auth; 400 role → re-spawn
Escalating on one slow cycle	`escalate only if stuck >15min AND CPU<2%`
`git reset --hard` to recover a worker	`NEVER git reset --hard on workers`
Blocking the orchestrator waiting	Poll on cadence; schedule the next wakeup
Dispatching without a state snapshot	Supply done / unpushed / blocking-cause / hypothesis
Long run with no checkpoint	Commit + push first so it's recoverable across machines