name	task
description	Structured operational hygiene for research, data gathering, and novel tasks that don't have their own play
disable-model-invocation	true

Task

For any task that doesn't have its own play: research, data gathering, one-off analysis, building something new. The task itself is unstructured, but the operational hygiene around it is not.

Arguments

$ARGUMENTS is optional:

{task-name}: Used as the filename for the task brief (box/research/{task-name}-brief.md). If omitted, derive from the task description.
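
A minimal sketch of how the brief path could be resolved. The play only says the name is derived from the task description when omitted; the slug rule below (lowercase, hyphen-separated) and the function names are assumptions for illustration.

```python
import re


def slugify(text):
    """Lowercase the text and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def brief_path(task_name=None, description=""):
    """Return box/research/{task-name}-brief.md.

    An explicit task_name is used verbatim; otherwise the name is derived
    from the description (hypothetical rule, not specified by the play).
    """
    name = task_name or slugify(description)
    if not name:
        raise ValueError("need a task name or a task description")
    return f"box/research/{name}-brief.md"


if __name__ == "__main__":
    print(brief_path("churn-q2"))
    print(brief_path(None, "Pull PostHog data, weekly!"))
```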

Phases

The play moves through five phases. Set todos at play start and mark each phase transition explicitly.

[ ] Start: task brief created at box/research/{name}-brief.md
[ ] Plan: queries/steps written in task brief before executing
[ ] Execute: intermediate results saved to durable files
[ ] Synthesize: deliverable produced
[ ] Close: code committed, log updated, play promotion evaluated

Phase Details

1. Start

Resuming a prior investigation (.agent/state/active.json exists):

Read active.json for the investigation ID and brief_path
Skip steps 1-2 below — reopen the existing task brief at brief_path
Run python3 box/agent-state.py views --investigation I-NNN to regenerate brief.md from current state (always regenerate on resume regardless of status — this eliminates stale views from interrupted sessions)
Read .agent/views/brief.md for current investigation state (verified vs. tentative claims, open questions, recommended next action)
python3 box/agent-state.py log session_start --investigation I-NNN --session-id $SESSION_ID
Set the five phase todos as usual

Starting a new investigation (no active.json, or starting fresh):

1. Copy box/research/task-brief-template.md to box/research/{name}-brief.md
2. Fill in the Objective and Intermediate Files sections
3. Check reference/tooling-logistics.md for gotchas relevant to your data sources
4. python3 box/agent-state.py log session_start --session-id $SESSION_ID --brief-path box/research/{name}-brief.md (auto-generates investigation ID)
5. Set the five phase todos
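
The resume-versus-new decision above can be sketched as a small check on .agent/state/active.json. The layout of that file is not documented here: the brief_path key is named in the play, while the investigation_id key and the start_mode helper are assumptions.

```python
import json
from pathlib import Path


def start_mode(repo_root="."):
    """Decide whether the Start phase resumes an investigation or opens a new one."""
    active = Path(repo_root) / ".agent" / "state" / "active.json"
    if not active.exists():
        # No active state: follow the "starting a new investigation" steps.
        return {"mode": "new"}
    state = json.loads(active.read_text(encoding="utf-8"))
    return {
        "mode": "resume",
        # Assumed key name; the play only says the file holds the investigation ID.
        "investigation": state.get("investigation_id"),
        "brief_path": state.get("brief_path"),
    }


if __name__ == "__main__":
    print(start_mode("."))
```

On resume, the returned brief_path is the file to reopen instead of copying the template again.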

The task brief is the durable artifact. If context compaction hits, the brief survives. If the session crashes, the next session can resume from the brief.

2. Plan

Write your queries, steps, and merge logic in the task brief before executing any of them. This is the phase gate: nothing runs until the plan is written down.

If you discover during planning that the task shape is unclear, say so. Planning is where ambiguity gets resolved, not mid-execution.

Get user approval on the brief before moving to Execute. The plan is a checkpoint, not just a file. Present it inline or via plan approval and wait for explicit approval before executing.

3. Execute

Save each intermediate result to a durable file as it comes back. The file paths should already be in the task brief from the Plan phase.
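
One way to make "durable" concrete is to write each result through a temporary file and rename it into place, so a crash mid-write never leaves a half-written intermediate file. This is a sketch under that assumption; save_intermediate and the JSON format are illustrative choices, not something the play prescribes.

```python
import json
import os
import tempfile
from pathlib import Path


def save_intermediate(path, data):
    """Write data as JSON to path via a temp file plus an atomic rename."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    os.replace(tmp_name, target)
    return target


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    print(save_intermediate(Path(out_dir) / "rows.json", {"rows": [1, 2]}))
```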

Log every data source interaction via the agent memory system:

Each data source read: python3 box/agent-state.py log source_read --source {name} --record-id {id}
Each query: python3 box/agent-state.py log query_run --tool {name} --query "{text}"
Each source skip: python3 box/agent-state.py log source_skipped --source {name} --reason "{why}"
Delegate dispatch/collection: python3 box/agent-state.py log delegate_dispatched|delegate_collected --task-id {id} ...

The --investigation flag is optional when active.json exists. Intermediate results still save to box/research/ files as before — the event log tracks that they were read.
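
The logging commands above share one shape, so a thin wrapper can build the argument list. The sketch below only constructs the command and does not run it; the event names and flags are the ones listed above, while the log_command helper itself is hypothetical.

```python
import shlex


def log_command(event, investigation=None, **flags):
    """Build the argv for `python3 box/agent-state.py log <event> ...`."""
    argv = ["python3", "box/agent-state.py", "log", event]
    if investigation:
        # Optional when active.json exists, per the note above.
        argv += ["--investigation", investigation]
    for key, value in flags.items():
        # record_id="42" becomes --record-id 42
        argv += ["--" + key.replace("_", "-"), str(value)]
    return argv


if __name__ == "__main__":
    cmd = log_command("source_skipped", source="slack", reason="no access")
    print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell-quoting problems with free-text values such as --query and --reason.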

When results look suspicious, stop and investigate. Do not rationalize unexpected data. The tendency is to explain why wrong data is acceptable rather than fix it. If a number looks off, the default is "this is wrong and I need to understand why," not "this is probably fine because..."

If you need to modify code (scripts, client libraries, etc.) as part of execution, note the changes in the task brief's Decisions section.

Context cost of data gathering. Subagent delegation protects the main context from high-volume reads, but the collected results stay in context for the rest of the session. Use output_instructions to constrain the return shape (file paths plus key findings, not full excerpts). For direct file reads, use line ranges, and Grep for structure first. Every KB spent here is a KB unavailable for Synthesize, Close, and session-end.
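
As an illustration of the line-range advice, the sketch below returns only the requested lines of a file and stops reading once the range is covered. read_lines is a hypothetical helper, not part of the play's tooling.

```python
from itertools import islice


def read_lines(path, start, end):
    """Return lines start..end (1-indexed, inclusive) without reading past end."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, start - 1, end)]


if __name__ == "__main__":
    # Print the first five lines of this script as a smoke test.
    for text in read_lines(__file__, 1, 5):
        print(text)
```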

4. Synthesize

⚠ SCOPE GATE. Before writing the deliverable, generate scope accounting from the event log:

python3 box/agent-state.py accounting

Present the machine-derived output to the user. Fill in the agent-supplied sections (plan commitments, additional context) before presenting. Wait for the user to confirm coverage is sufficient before writing.

The machine-derived sections (sources read, queries run, delegates, claim status) are authoritative — they come from the event log, not from memory. The agent-supplied sections are explicitly marked as manual input.

Re-read intermediate files before composing. The data was read earlier but the compose step must work from the files, not from memory of what they said. This activates even when the data feels recent and in-context — "I already read this" is the exact instinct that suppresses the re-read. Proved twice: 2026-03-27 (proposal reading), 2026-04-09 (phase assignments, blocking analysis).
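
A sketch of enforcing the re-read mechanically: load every intermediate file from disk at compose time and fail loudly if one is missing, rather than composing from memory. The load_intermediates helper is hypothetical, and the paths are assumed to be the ones recorded in the task brief.

```python
from pathlib import Path


def load_intermediates(paths):
    """Read every intermediate file fresh from disk; refuse to proceed if any is missing."""
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"intermediate files missing: {missing}")
    return {str(p): Path(p).read_text(encoding="utf-8") for p in paths}


if __name__ == "__main__":
    print(list(load_intermediates([__file__])))
```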

Produce the deliverable. The shape depends on the task — could be a CSV, a report, a card draft, a script.

Log claims in the deliverable. Log each claim the deliverable makes:

Propose: python3 box/agent-state.py claim propose --body "..." --basis "..." --confidence tentative --source-refs E-NNN ...
Verify (requires existing proposal): python3 box/agent-state.py claim verify --claim C-NNN --method "..." --result "..." --source-refs E-NNN ...

Before finalizing, manifest claims and validate:

python3 box/agent-state.py claim manifest --claims C-NNN ... --deliverable {path}
python3 box/agent-state.py validate

Correctness claims require primary source verification. When the deliverable includes judgments about whether an external system (bot, API, feature) produced a correct or incorrect result, label those judgments as candidates pending verification — not findings — until verified against the system's code, data, or actual behavior. Conversation text, user reports, and error descriptions are intermediaries, not primary sources. Verification is per-claim: verifying one claim does not validate others about different features. Proved 2026-04-16: 5 bot conversations labeled "bot was incorrect" from conversation evidence; codebase verification of 1 found the bot described a real feature applied to the wrong problem.
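
The labeling rule above can be modeled as a per-claim check on the kind of evidence behind each judgment. This is a toy model only: the Judgment class and the evidence-kind strings are assumptions, and it does not describe how agent-state.py stores claims.

```python
from dataclasses import dataclass

# Primary sources per the paragraph above: the system's code, data, or actual behavior.
PRIMARY_SOURCES = {"code", "data", "observed_behavior"}


@dataclass
class Judgment:
    text: str
    evidence_kind: str  # e.g. "conversation", "user_report", "error_description", "code"

    @property
    def label(self):
        """A judgment is a finding only when its own evidence is a primary source."""
        if self.evidence_kind in PRIMARY_SOURCES:
            return "finding"
        return "candidate pending verification"


if __name__ == "__main__":
    judgments = [
        Judgment("bot gave an incorrect answer", "conversation"),
        Judgment("bot described a real feature", "code"),
    ]
    for j in judgments:
        # Each judgment is labeled independently; verifying one says nothing about the other.
        print(f"{j.label}: {j.text}")
```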

Writing to large target files. Edit returns the full file as confirmation. For end-of-file appends of approved content, Bash cat >> avoids the echo-back.

Compaction insurance. Use approve_content for composed prose going to production surfaces (card descriptions, analysis comments, observations, email drafts, findings messages). Formulaic mutations (tracked replies, "This shipped!" replies, reactions, link-back lines, state changes) go through execute_approved only. Saved files at .agenterminal/approved/{content_type}/ survive compaction. After compaction, read the saved file to recover approved text. See play-specific approval points in box/shortcut-ops.md for content_type and filename conventions per play.

5. Close

⚠ NARRATION GATE. Close-phase steps feel procedural after the deliverable ships — completion bias activates here reliably. Before EACH step below: say one sentence to the user about what you are doing and wait for acknowledgment before tool calls that commit, write log entries, or update reference docs. Proved 2026-04-15: commit to main + two log entries + tooling update executed as autonomous stream, user had to interrupt.
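
The gate amounts to a loop that announces each step and stops at the first one the user does not acknowledge. A minimal sketch, assuming the caller supplies announce and confirm callbacks; none of these names come from the play.

```python
def run_close_steps(steps, announce, confirm):
    """Run (description, action) pairs in order, one acknowledgment per step.

    Stops at the first step the user does not acknowledge and returns the
    descriptions of the steps that actually ran.
    """
    done = []
    for description, action in steps:
        announce(description)          # one sentence about what is about to happen
        if not confirm(description):   # wait for acknowledgment before any tool call
            break
        action()
        done.append(description)
    return done


if __name__ == "__main__":
    steps = [("Commit code changes", lambda: print("committed"))]
    run_close_steps(steps, announce=print, confirm=lambda _: True)
```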

End the investigation session and regenerate views:

python3 box/agent-state.py log session_end --session-id $SESSION_ID
python3 box/agent-state.py views

Commit any code changes made during the task
Update the investigation log if there were operational learnings: python3 box/log-cli.py write --date ... --topic ... --lesson ... --bullet ...
Play promotion: Was this a one-off, or did we just discover a repeatable pattern? If the task is likely to recur, the artifacts from this run (the task brief, the intermediate steps, the gotchas encountered) are the seed for a new dedicated play. Draft the play definition and propose adding it to the index.

When NOT to Use

This play is for tasks that hit external data sources and produce durable artifacts: CSVs, reports, scripts. It is not for conversation-shaped work like process design, brainstorming, or retrospectives, where the conversation itself is the deliverable. The user triggers this play explicitly; don't self-activate based on the word "novel."

What This Play Does NOT Prescribe

What data sources to use
How to structure the investigation
What the deliverable looks like
Methodology or analysis approach

Those flex to the task. The phases and their gates don't.
