| name | deadfish |
| description | deadf(ish) autonomous development pipeline v3.0.0 — heartbeat-driven. Clawdbot cron replaces ralph.sh. Each cycle = fresh isolated session. STATE.yaml = continuity. Strict role separation: Clawdbot orchestrates, GPT-5.2 plans, gpt-5.2-codex implements, verify.sh + Opus sub-agent verify. Transforms vision into shipped code through automated cron-driven cycles with sentinel DSL, deterministic verification, and conservative safety. Use when building software projects or continuing development work. Use when Fred mentions deadfish, pipeline, autonomous coding, or continuing a dev project. Also applies when discussing cycle flow, sentinel blocks, plan/verdict parsing, verify scripts, state management, or heartbeat-driven autonomous development.
Quick Reference
| Action | Command / Method |
|---|---|
| Start pipeline | Create cron job (see Activation) |
| Stop pipeline | Disable cron job OR set phase: needs_human |
| Run one manual cycle | Follow Cycle Protocol directly |
| Plan a task | codex-mcp-call.sh --model gpt-5.2 --sandbox danger-full-access "..." |
| Implement a task | codex-mcp-call.sh --model gpt-5.2-codex --full-auto --cwd <project> "..." |
| Implement (complex) | Same with --model gpt-5.2-codex-high |
| Run verification | ./verify.sh (outputs JSON to stdout) |
| Parse plan output | python3 extract_plan.py --nonce <NONCE> < raw_output |
| Parse verdict output | echo '<json>' \| python3 build_verdict.py --nonce <NONCE> --criteria AC1,AC2,... |
| LLM verify criterion | sessions_spawn sub-agent per acceptance criterion |
| Read state | yq -r '.<field>' STATE.yaml |
| Write state | Atomic: flock → yq → temp → mv (see State Write Protocol) |
| Post status | message action=send to pipeline Discord channel |
Architecture
Heartbeat-Driven Execution
┌──────────┐  fires every   ┌───────────────┐   dispatch   ┌──────────────┐
│ Cron Job │── 3-5 min ────▶│   Isolated    │─────────────▶│ GPT-5.2      │
│ (driver) │                │   Session     │              │ GPT-5.2-Codex│
└──────────┘                │               │              │ Opus 4.5     │
                            │ flock → read  │              └──────────────┘
┌──────────┐                │ → ONE action  │
│ Discord  │◀── status ─────│ → write state │
│ #pipeline│   one-liner    │ → release     │
└──────────┘                └───────────────┘
No ralph.sh. No external loop. Clawdbot's cron IS the heartbeat.
Each cron tick spawns a fresh isolated session. The session:
- Acquires flock (non-blocking; exits if held)
- Reads STATE.yaml
- Runs ONE action
- Updates STATE.yaml atomically
- Posts status to Discord
- Exits (flock released automatically)
STATE.yaml is the continuity. Not the context window.
Five Actors, Strict Boundaries
| Role | Actor | Writes To |
|---|---|---|
| Driver | Clawdbot Cron | Fires sessions on schedule |
| Orchestrator | Clawdbot (Claude Opus 4.5) | STATE.yaml (all fields) |
| Planner | GPT-5.2 (via Codex MCP) | stdout only (sentinel plan blocks) |
| Implementer | gpt-5.2-codex (via Codex MCP) | Source code + git commits |
| Verifier (Script) | verify.sh (bash) | stdout only (JSON) |
| Verifier (LLM) | Opus 4.5 sub-agent (sessions_spawn) | stdout only (sentinel verdict blocks) |
No actor crosses into another's domain. Clawdbot orchestrates but never writes code or judges quality. GPT-5.2 plans but never touches files. gpt-5.2-codex implements but never plans or verifies.
Concurrency Guard
The Problem
Two cron ticks could read cycle.status: idle simultaneously and both claim ownership.
The Solution: flock -n
The ENTIRE cycle is wrapped in a non-blocking filesystem lock:
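(
  flock -n 9 || exit 0   # lock held by another session: exit quietly
  # ... the entire cycle (Steps 0-7) runs here, while fd 9 holds the lock ...
) 9>/path/to/project/.deadf/cycle.flock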
- flock -n: Non-blocking. If held, exit immediately. Zero wait, zero conflict.
- Lock released automatically when session exits (even on crash; the OS handles it).
- Each project has its own .deadf/cycle.flock. Multiple projects don't interfere.
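The same non-blocking guard, sketched in Python for tooling that is not written in bash. This is only an illustration under assumptions: try_acquire is a hypothetical helper, not part of the pipeline, and it relies on POSIX flock semantics via the standard fcntl module.

```python
import fcntl
import os


def try_acquire(lock_path):
    """Return an open lock file if the lock was acquired, else None."""
    os.makedirs(os.path.dirname(lock_path) or ".", exist_ok=True)
    lock_file = open(lock_path, "w")
    try:
        # LOCK_NB is the equivalent of `flock -n`: fail instead of waiting.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()
        return None
    # Keep this object alive for the whole cycle; closing it releases the lock.
    return lock_file
```

A caller would exit quietly when try_acquire returns None, mirroring `flock -n 9 || exit 0`.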
Stale Lease Recovery
If a session hangs (process alive but stuck), the lease renewal mechanism detects it:
cycle:
  status: running
  started_at: "2026-01-30T03:30:00Z"
  session_key: "agent:main:cron:deadfish-mnemo"
  last_heartbeat_at: "2026-01-30T03:35:00Z"
Inside flock, before claiming:
- If cycle.status == running AND now - last_heartbeat_at > stale_timeout_min → recover (reset to idle, log warning)
- If cycle.status == running AND not stale → another session legitimately owns the cycle, so exit. This case should not happen, since we already hold the flock; it is a belt-and-suspenders check.
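The staleness test can be sketched as a small pure function. The names parse_ts and is_stale are hypothetical, and treating a running cycle with no heartbeat as stale is an assumption of this sketch, not something the protocol states.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts):
    """Parse an ISO-8601 UTC timestamp such as 2026-01-30T03:35:00Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def is_stale(cycle, now, stale_timeout_min):
    """True when a running cycle has not renewed its lease within the timeout."""
    if cycle.get("status") != "running":
        return False
    heartbeat = cycle.get("last_heartbeat_at")
    if heartbeat is None:
        return True  # assumption: running with no heartbeat counts as stale
    return now - parse_ts(heartbeat) > timedelta(minutes=stale_timeout_min)
```

With the example state above and stale_timeout_min: 45, the cycle becomes recoverable once more than 45 minutes have passed since 03:35:00Z.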
Lease Renewal
Long-running actions update last_heartbeat_at at sub-step boundaries:
implement_task:
1. Update last_heartbeat_at → NOW (before dispatching to Codex)
2. Dispatch to gpt-5.2-codex (may take 20+ min for high reasoning)
3. Update last_heartbeat_at → NOW (after Codex returns)
4. Read git state, update STATE.yaml
Cycle Protocol
When triggered (by cron or manual), execute these steps in order:
Step 0: GUARD
- Acquire flock -n on <project>/.deadf/cycle.flock
- If cannot acquire → exit silently (another session owns the cycle)
- Read cycle.status from STATE.yaml:
  - idle → proceed to Step 1
  - running + stale (now - last_heartbeat_at > stale_timeout_min) → log recovery, reset to idle, proceed
  - running + not stale → release flock, exit (shouldn't happen inside flock, but safety check)
  - complete/failed → reset to idle, proceed
- Check phase:
  - needs_human → post alert to Discord, release flock, exit
  - complete → post completion summary, release flock, exit
  - Any other → proceed
Step 1: LOAD
Read these files:
- STATE.yaml: current pipeline state
- POLICY.yaml: mode behavior, thresholds, heartbeat config
- OPS.md: project-specific build/test/run commands (if present)
- Files listed in task.files_to_load in STATE.yaml (cap: under 3000 tokens)
Step 2: VALIDATE
- Parse STATE.yaml. If unparseable → phase: needs_human, post alert, exit.
- Generate cycle_id: cycle-<iteration+1>-<8 random hex chars>
- Derive nonce from cycle_id: sha256(cycle_id.encode('utf-8')).hexdigest()[:6].upper()
  - Format: exactly ^[0-9A-F]{6}$
- Claim cycle by writing to STATE.yaml atomically:
  cycle:
    id: <cycle_id>
    nonce: <derived_nonce>
    status: running
    started_at: <ISO-8601>
    session_key: <this session's key>
    last_heartbeat_at: <ISO-8601>
- Check budgets:
  - Time: now - budget.started_at >= max_hours → phase: needs_human, alert, exit
  - Budget 75%: elapsed time >= 75% of max_hours → warn per POLICY
  - Iterations: iteration >= max_iterations (200) → phase: needs_human, alert, exit
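The budget checks reduce to a three-way outcome. A minimal sketch, assuming timestamps are already parsed into timezone-aware datetimes; check_budget and its return labels are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_budget(started_at, now, max_hours, iteration, max_iterations=200):
    """Return 'needs_human', 'warn', or 'ok' for the current cycle."""
    elapsed = now - started_at
    limit = timedelta(hours=max_hours)
    if elapsed >= limit or iteration >= max_iterations:
        return "needs_human"   # hard stop: time or iteration budget exhausted
    if elapsed >= 0.75 * limit:
        return "warn"          # 75% threshold: warn per POLICY, keep running
    return "ok"
```

With max_hours: 24, the warning starts at 18 hours elapsed, which matches the BUDGET 75% alert format used for Discord.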
Step 3: DECIDE
Read phase and task.sub_step. The action is deterministic — first matching row wins:
| # | Phase | Condition | Action |
|---|---|---|---|
| 1 | Any | Budget exceeded or state invalid | escalate |
| 2 | execute | stuck_count >= stuck_threshold AND replan_attempted == true | escalate |
| 3 | execute | stuck_count >= stuck_threshold AND replan_attempted == false | replan_task |
| 4 | execute | sub_step: implement + last_result.ok == false + retry_count >= max_retries | rollback_and_escalate |
| 5 | execute | sub_step: implement + last_result.ok == false + retry_count < max_retries | retry_task |
| 6 | research | — | seed_docs |
| 7 | select-track | No track selected | pick_track |
| 8 | select-track | Track selected, no spec | create_spec |
| 9 | select-track | Spec exists, no plan | create_plan |
| 10 | execute | sub_step: null or generate | generate_task |
| 11 | execute | sub_step: implement | implement_task |
| 12 | execute | sub_step: verify | verify_task |
| 13 | execute | sub_step: reflect | reflect |
| 14 | complete | — | summarize |
One cycle = one action. No chaining.
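The table can be read as an ordered chain of guards. The sketch below is one possible encoding, not the orchestrator's actual logic: decide is a hypothetical function, state and policy are plain dicts, and the flags budget_exceeded, invalid, and has_spec are invented stand-ins for checks the protocol performs elsewhere.

```python
def decide(state, policy):
    """Return the action name for this cycle; the first matching rule wins."""
    phase = state["phase"]
    task = state.get("task", {})
    track = state.get("track", {})
    sub_step = task.get("sub_step")
    stuck = state.get("loop", {}).get("stuck_count", 0) >= policy["stuck_threshold"]
    failed = sub_step == "implement" and state.get("last_result", {}).get("ok") is False
    max_retries = task.get("max_retries", policy["max_retries"])
    retries_left = task.get("retry_count", 0) < max_retries

    if state.get("budget_exceeded") or state.get("invalid"):      # row 1
        return "escalate"
    if phase == "execute" and stuck:                              # rows 2-3
        return "escalate" if task.get("replan_attempted") else "replan_task"
    if phase == "execute" and failed:                             # rows 4-5
        return "retry_task" if retries_left else "rollback_and_escalate"
    if phase == "research":                                       # row 6
        return "seed_docs"
    if phase == "select-track":                                   # rows 7-9
        if not track.get("id"):
            return "pick_track"
        return "create_plan" if track.get("has_spec") else "create_spec"
    if phase == "execute":                                        # rows 10-13
        by_sub_step = {None: "generate_task", "generate": "generate_task",
                       "implement": "implement_task", "verify": "verify_task",
                       "reflect": "reflect"}
        return by_sub_step.get(sub_step, "escalate")
    if phase == "complete":                                       # row 14
        return "summarize"
    return "escalate"  # assumption: an unknown phase is escalated
```

Because each guard returns immediately, exactly one action comes out per call, which is the "one cycle = one action" rule.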
Step 4: EXECUTE
Run the determined action. See Action Specifications.
Step 5: RECORD
Update STATE.yaml atomically (see State Write Protocol):
- cycle.status: complete or failed
- cycle.finished_at: ISO-8601
- loop.iteration: always increment (even on failure)
- last_action: action name
- last_result: outcome details
- Action-specific fields per action spec
Step 6: REPORT
Post one-liner to Discord pipeline channel:
{emoji} #{iteration} | {action} | {project}:{task_id} | {details} | → {next_step}
Examples:
✅ #47 | generate_task | mnemo:tui-09 | TASK.md written | → implement
❌ #49 | verify_task | mnemo:tui-09 | FAIL: 2 tests broken | retry 1/3
🚨 #55 | escalate | mnemo:api-03 | 3x fail, rolled back | needs_human
🏁 #103 | summarize | mnemo | PROJECT COMPLETE: 5 tracks, 38 tasks
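A formatter for the standard one-liner might look like the sketch below. format_status is a hypothetical helper; it covers only the regular "→ next step" shape, not the retry or alert variants shown in the examples.

```python
def format_status(emoji, iteration, action, project, details, next_step, task_id=None):
    """Build the per-cycle one-liner; task_id is omitted for project-level actions."""
    target = f"{project}:{task_id}" if task_id else project
    return f"{emoji} #{iteration} | {action} | {target} | {details} | → {next_step}"
```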
Step 7: RELEASE
- Set cycle.status: idle (unless cycle.status was set to failed, or phase was set to needs_human)
- flock released automatically on session exit
Action Specifications
seed_docs (research phase)
- Read project files, understand codebase structure
- Generate VISION.md and ROADMAP.md (consult GPT-5.2 via MCP)
- Set phase: select-track
pick_track (select-track phase)
- Consult GPT-5.2 planner (via MCP) to select next track from tracks_remaining
- Set track.id, track.name, track.status: in-progress
create_spec / create_plan (select-track phase)
- Consult GPT-5.2 planner (via MCP) for track spec/plan
- Parse output with extract_plan.py --nonce <nonce>
- On plan complete: set phase: execute, task.sub_step: generate
generate_task (execute phase)
- Dispatch to GPT-5.2 via MCP with layered prompt (orientation → objective → rules → guardrails)
- Parse with extract_plan.py --nonce <nonce>
- Write TASK.md from parsed plan
- Update: task.id, task.description, task.sub_step: implement, task.files_to_load
- On parse failure after retry: CYCLE_FAIL
implement_task (execute phase)
- Idempotency check: If git log --oneline -1 matches current task ID → skip, set sub_step: verify
- Update last_heartbeat_at (lease renewal before long operation)
- Dispatch to gpt-5.2-codex via MCP:
  codex-mcp-call.sh --model gpt-5.2-codex --full-auto --cwd <project> "<implementation prompt>"
  For complex tasks: use --model gpt-5.2-codex-high
- Update last_heartbeat_at (lease renewal after return)
- Read results from git:
  commit_hash=$(git -C <project> rev-parse HEAD)
  files_changed=$(git -C <project> diff HEAD~1 --name-only)
  diff_lines=$(git -C <project> diff HEAD~1 --stat)
- On success (new commit): set task.sub_step: verify
- On failure (no commit): set last_result.ok: false
verify_task (execute phase)
Stage 1: Deterministic
cd <project> && ./verify.sh
Output: JSON with pass, checks, failures. If pass == false → FAIL immediately.
Stage 2: LLM (only if Stage 1 passes)
- DET: prefixed criteria → auto-pass (covered by verify.sh)
- LLM: prefixed criteria → spawn one sessions_spawn sub-agent per criterion
- Each sub-agent produces a sentinel verdict block
Stage 3: Combined
echo '<responses_json>' | python3 build_verdict.py --nonce <nonce> --criteria AC1,AC2,...
| verify.sh | LLM | Result |
|---|---|---|
| FAIL | (not run) | FAIL |
| PASS | FAIL | FAIL |
| PASS | NEEDS_HUMAN | pause |
| PASS | PASS | PASS |
| PASS | parse failure | NEEDS_HUMAN |
On PASS: sub_step: reflect, update last_cycle.*, last_result.ok: true
On FAIL: increment retry_count, sub_step: implement, last_result.ok: false
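The criterion routing and the combination table can be sketched as two small functions. Both names are hypothetical, and the criteria shape (dicts with id and text) is assumed from the ACCEPTANCE lines of the plan block; this is not how build_verdict.py is known to work.

```python
def split_criteria(criteria):
    """Split acceptance criteria into DET: ids (auto-pass) and LLM: ids (sub-agent)."""
    det = [c["id"] for c in criteria if c["text"].startswith("DET:")]
    llm = [c["id"] for c in criteria if c["text"].startswith("LLM:")]
    return det, llm


def combine(verify_pass, llm_result=None):
    """verify_pass: bool from verify.sh. llm_result: 'PASS', 'FAIL',
    'NEEDS_HUMAN', or None when the verdict could not be parsed."""
    if not verify_pass:
        return "FAIL"          # deterministic failure wins; LLM stage is not run
    if llm_result == "PASS":
        return "PASS"
    if llm_result == "FAIL":
        return "FAIL"          # conservative default
    if llm_result == "NEEDS_HUMAN":
        return "pause"
    return "NEEDS_HUMAN"       # parse failure
```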
reflect (execute phase)
- Update baselines:
  last_good.commit: <HEAD>
  last_good.task_id: <current task>
  last_good.timestamp: <now>
- Advance: more tasks → sub_step: generate | track done → phase: select-track | all done → phase: complete
- Reset: retry_count: 0, stuck_count: 0, replan_attempted: false
retry_task (execute phase)
- Set sub_step: implement
- Include failure context in next implementation prompt
replan_task (execute phase — stuck recovery)
- Set replan_attempted: true, reset stuck_count: 0, retry_count: 0
- Set sub_step: generate (regenerate task from scratch)
rollback_and_escalate (execute phase)
git stash
git checkout -b rescue-{run_id}-{task_id}
git checkout main
git reset --hard {last_good_commit}
Set phase: needs_human, post alert.
summarize (complete phase)
Post completion summary to Discord. Set phase: complete.
escalate (any phase)
Set phase: needs_human. Post alert with context.
Sentinel DSL
Plan Block (GPT-5.2 → extract_plan.py)
<<<PLAN:V1:NONCE={nonce}>>>
TASK_ID=auth-01-03
TITLE="Implement JWT refresh token rotation"
SUMMARY=
Multi-line description here.
FILES:
- path=src/auth/jwt.ts action=modify rationale="Add refresh logic"
ACCEPTANCE:
- id=AC1 text="DET: All tests pass"
- id=AC2 text="LLM: Auth module exports refresh() method"
ESTIMATED_DIFF=120
<<<END_PLAN:NONCE={nonce}>>>
Verdict Block (Sub-agent → build_verdict.py)
<<<VERDICT:V1:AC1:NONCE={nonce}>>>
ANSWER=YES
REASON="Criterion met: endpoint returns both tokens."
<<<END_VERDICT:AC1:NONCE={nonce}>>>
Nonce Derivation
sha256(cycle_id.encode('utf-8')).hexdigest()[:6].upper()
- Format: ^[0-9A-F]{6}$
- Same nonce for entire cycle
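Spelled out with imports, the derivation and the cycle_id format from Step 2 look roughly like this. new_cycle_id and derive_nonce are hypothetical helper names, and using secrets.token_hex for the random suffix is an assumption.

```python
import hashlib
import re
import secrets

NONCE_RE = re.compile(r"^[0-9A-F]{6}$")


def new_cycle_id(iteration):
    """Build cycle-<iteration+1>-<8 random hex chars>."""
    return f"cycle-{iteration + 1}-{secrets.token_hex(4)}"


def derive_nonce(cycle_id):
    """First six hex digits of sha256(cycle_id), uppercased."""
    nonce = hashlib.sha256(cycle_id.encode("utf-8")).hexdigest()[:6].upper()
    if not NONCE_RE.match(nonce):
        raise ValueError("nonce does not match ^[0-9A-F]{6}$")
    return nonce
```

Because the nonce is a pure function of cycle_id, any step in the cycle can recompute it from STATE.yaml instead of trusting a copy passed around in prompts.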
Rules
- One block per LLM response
- Sentinels on their own line
- Open nonce == close nonce == expected nonce
- One format-repair retry, then CYCLE_FAIL/NEEDS_HUMAN
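To make the rules concrete, here is a sketch of a verdict-block parser that enforces them. It is an illustration only and is not taken from build_verdict.py; parse_verdict, the regex, and the choice to return None on any violation are all assumptions.

```python
import re

# Sentinels must sit on their own lines, hence the ^ and $ anchors with MULTILINE.
VERDICT_RE = re.compile(
    r"^<<<VERDICT:V1:(?P<ac>\w+):NONCE=(?P<open>[0-9A-F]{6})>>>\n"
    r"(?P<body>.*?)\n"
    r"^<<<END_VERDICT:(?P=ac):NONCE=(?P<close>[0-9A-F]{6})>>>$",
    re.MULTILINE | re.DOTALL,
)


def parse_verdict(text, expected_nonce):
    """Return (criterion_id, answer, reason), or None if any rule is violated."""
    blocks = list(VERDICT_RE.finditer(text))
    if len(blocks) != 1:                       # rule: one block per response
        return None
    m = blocks[0]
    if not (m["open"] == m["close"] == expected_nonce):
        return None                            # rule: open == close == expected
    fields = dict(
        line.split("=", 1) for line in m["body"].splitlines() if "=" in line
    )
    if "ANSWER" not in fields:
        return None
    return m["ac"], fields["ANSWER"], fields.get("REASON", "").strip('"')
```

A None result is where the single format-repair retry would kick in before falling back to CYCLE_FAIL/NEEDS_HUMAN.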
State Management
STATE.yaml — Single Source of Truth
project: mnemo
phase: execute
mode: yolo
_run_id: "run-2026-01-30-a1b2c3d4"
cycle:
  status: idle
  id: null
  nonce: null
  started_at: null
  finished_at: null
  session_key: null
  last_heartbeat_at: null
loop:
  iteration: 0
  stuck_count: 0
track:
  id: null
  name: null
  status: null
  tasks_total: 0
  task_current: 0
tracks_remaining: []
tracks_completed: []
task:
  id: null
  description: null
  sub_step: null
  retry_count: 0
  max_retries: 3
  replan_attempted: false
  files_to_load: []
last_action: null
last_result:
  ok: null
  details: null
last_good:
  commit: null
  task_id: null
  timestamp: null
last_cycle:
  commit_hash: null
  test_count: null
  diff_lines: null
budget:
  started_at: null
  max_hours: 24
POLICY.yaml — Mode + Heartbeat Config
modes:
  yolo:
    description: "Full autonomy. Pause only on stuck/failure."
    notifications:
      track_complete: silent
      task_complete: silent
      stuck: pause
      triple_fail_rollback: pause
      budget_75_percent: warn
      complete: summary
    approvals:
      new_track: false
      task_start: false
  hybrid:
    description: "Autonomous with human checkpoints at track boundaries."
    notifications:
      track_complete: notify
      new_track_starting: notify
      task_complete: silent
      stuck: pause
      triple_fail_rollback: pause
      budget_75_percent: warn
      complete: summary
    approvals:
      new_track: true
      task_start: false
  interactive:
    description: "Human approves each task."
    notifications:
      track_complete: notify
      new_track_starting: notify
      task_complete: notify
      stuck: pause
      triple_fail_rollback: pause
      budget_75_percent: warn
      complete: summary
    approvals:
      new_track: true
      task_start: true
escalation:
  stuck_threshold: 3
  max_retries: 3
  max_iterations: 200
  max_hours: 24
heartbeat:
  enabled: true
  cycle_interval_min: 3
  stale_timeout_min: 45
  lease_renewal: true
  discord_channel: null
  status_format: oneliner
rollback:
  authority: clawdbot
  trigger: "task.retry_count >= task.max_retries"
verification:
  format_repair_retries: 1
State Write Protocol
ALL STATE.yaml writes use flock + atomic rename:
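(
  flock -w 10 9 || { echo "FLOCK_FAIL"; exit 70; }
  tmp=$(mktemp "STATE.yaml.tmp.XXXXXX") || exit 1
  # -y keeps the output as YAML (assumes the jq-wrapper yq, which provides --arg)
  if yq -y --arg v "$value" ".$field = \$v" STATE.yaml > "$tmp"; then
    mv -f "$tmp" STATE.yaml   # atomic rename within the same directory
  else
    rm -f "$tmp"; exit 1      # never replace STATE.yaml with a partial file
  fi
) 9>"$PROJECT/.deadf/cycle.flock"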
Owner verification: Before any state update during a cycle, assert cycle.session_key == this session. If mismatch → abort (another session took over).
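The temp-file-plus-rename pattern and the owner check, sketched in Python. atomic_write and assert_owner are hypothetical helpers; YAML serialization and taking the flock are deliberately left out, so this only shows the atomic replace and the session_key assertion.

```python
import os
import tempfile


def atomic_write(path, text):
    """Replace path so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(prefix="STATE.yaml.tmp.", dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise


def assert_owner(state, session_key):
    """Abort the write if another session has claimed the cycle."""
    if state["cycle"]["session_key"] != session_key:
        raise RuntimeError("cycle is owned by another session")
```

Calling assert_owner on freshly read state immediately before atomic_write keeps the check and the write as close together as possible.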
Model Dispatch
| Purpose | Method | Notes |
|---|---|---|
| Planning | codex-mcp-call.sh --model gpt-5.2 --sandbox danger-full-access | No timeout. Let it think. |
| Implementation | codex-mcp-call.sh --model gpt-5.2-codex --full-auto --cwd <project> | Full filesystem access |
| Complex impl | codex-mcp-call.sh --model gpt-5.2-codex-high --full-auto --cwd <project> | Parsers, state machines |
| LLM Verification | sessions_spawn (Opus 4.5 sub-agent) | One per LLM: criterion |
| Orchestration | This session (Opus 4.5) | Reads skill, follows protocol |
Wrapper script: /tank/dump/AGENTS/junior/scripts/codex-mcp-call.sh
Critical: Never set timeouts on GPT-5.2 calls. It thinks slowly. That's by design.
Stuck Detection
| Trigger | Condition | Action |
|---|---|---|
| Stuck (first) | stuck_count >= stuck_threshold + replan_attempted == false | replan_task |
| Stuck (after replan) | stuck_count >= stuck_threshold + replan_attempted == true | escalate |
| Budget time | elapsed >= max_hours | escalate |
| Budget iterations | iteration >= 200 | escalate |
| 3x task failure | retry_count >= max_retries | rollback_and_escalate |
| State invalid | Unparseable STATE.yaml | escalate |
| Stale session | now - last_heartbeat_at > stale_timeout_min | Auto-recover, log warning |
Safety Constraints
- Never write source code — delegate to gpt-5.2-codex via MCP
- Never override verifier verdicts — verify.sh FAIL = FAIL, period
- Deterministic wins — verify.sh always takes precedence over LLM judgment
- Conservative default — verify.sh PASS + LLM FAIL = FAIL
- One cycle = one action — never chain
- Atomic state updates — flock + temp + mv
- Nonce integrity — every sentinel parse uses the cycle's nonce
- Owner verification — assert session_key before every state write
- No secrets in files — ever
- Escalate when uncertain: needs_human is always safe
Project Structure
<project>/
├── .deadf/
│ ├── logs/ # Cycle logs (auto-rotated, max 50)
│ └── cycle.flock # Filesystem lock (replaces ralph.lock)
├── STATE.yaml # Pipeline state
├── POLICY.yaml # Mode + heartbeat config
├── OPS.md # Project-specific build/test commands
├── VISION.md # What we're building
├── ROADMAP.md # How we get there
├── TASK.md # Current task spec
├── extract_plan.py # Sentinel plan parser
├── build_verdict.py # Sentinel verdict parser
├── verify.sh # Deterministic verifier
└── src/, tests/, etc. # Actual project code
Getting Started
Initialize a New Project
- Create project directory:
  mkdir -p /tank/dump/DEV/<project>
  cd /tank/dump/DEV/<project>
  git init && mkdir -p .deadf/logs
- Copy pipeline files:
  cp /tank/dump/DEV/deadfish-pipeline/{extract_plan.py,build_verdict.py,verify.sh,POLICY.yaml} .
  chmod +x verify.sh
- Create STATE.yaml:
  project: <project-name>
  phase: research
  mode: yolo
  # YAML does not expand $(...); generate the value in a shell first, e.g.
  #   echo "run-$(date +%Y-%m-%d)-$(head -c4 /dev/urandom | xxd -p)"
  _run_id: "run-<YYYY-MM-DD>-<8 hex chars>"
  cycle:
    status: idle
  loop:
    iteration: 0
    stuck_count: 0
  budget:
    started_at: "<ISO-8601 now>"
    max_hours: 24
- Configure POLICY.yaml heartbeat section:
  heartbeat:
    enabled: true
    cycle_interval_min: 3
    stale_timeout_min: 45
    discord_channel: "channel:<your-pipeline-channel-id>"
- Commit initial state:
  git add STATE.yaml POLICY.yaml extract_plan.py build_verdict.py verify.sh
  git commit -m "init: deadf(ish) pipeline v3.0.0"
Activation
Create the cron job to start the pipeline:
cron add:
name: "deadfish-<project>"
schedule: "*/3 * * * *" (every 3 min)
sessionTarget: isolated
payload:
message: "🐟 DEADFISH CYCLE: Project '<project>' at /tank/dump/DEV/<project>/
Read the deadfish skill, then execute ONE pipeline cycle.
Acquire flock, read STATE.yaml, run one action, update state, post status."
deliver: true
channel: discord
to: "channel:<pipeline-channel-id>"
Deactivation
cron update: enabled: false
Or set phase: needs_human in STATE.yaml (cron fires but exits immediately).
Resume After needs_human
- Read STATE.yaml (last_action, last_result.details)
- Fix the issue
- Set phase back to appropriate value
- Set cycle.status: idle
- Re-enable cron job (if disabled)
Multiple Projects
Each project gets its own cron job. They run independently — different STATE.yaml, different flock files:
deadfish-mnemo → */3 * * * * → /tank/dump/DEV/mnemo/
deadfish-dealio → */5 * * * * → /tank/dump/DEV/dealio/
Discord Status Format
Per-Cycle One-Liner
✅ #47 | generate_task | mnemo:tui-09 | TASK.md written | → implement
✅ #48 | implement_task | mnemo:tui-09 | 3 files, +87 lines | → verify
❌ #49 | verify_task | mnemo:tui-09 | FAIL: 2 tests broken | retry 1/3
✅ #50 | implement_task | mnemo:tui-09 | retry: fixed assertions | → verify
✅ #51 | verify_task | mnemo:tui-09 | PASS: 4/4 AC met | → reflect
✅ #52 | reflect | mnemo:tui-09 | baseline updated | → generate (tui-10)
Transitions
🎯 Track complete: mnemo:tui (9/9 tasks)
🚀 New track: mnemo:api (6 tasks planned)
🏁 PROJECT COMPLETE: mnemo | 5 tracks, 38 tasks, 103 cycles
Alerts
🚨 STUCK: mnemo:api-03 | 3 cycles no progress | needs_human
🔄 ROLLBACK: mnemo:api-03 | 3x fail | rescue: rescue-run001-api03
⏰ BUDGET 75%: mnemo | 18h / 24h elapsed
⚠️ STALE RECOVERY: session died mid-cycle | auto-recovered
Skill version: 3.0.0-heartbeat — deadf(ish) v2.4.2 adapted for Clawdbot cron-driven execution. 🐟