| name | research-document |
| description | Create educational documents that build understanding progressively with citations |
A systematic approach to creating educational documentation through tiered research.
Creates in-depth explanatory documents that build understanding progressively and back every factual claim with a citation.
For quick factual lookups, use fact-find instead.
Research happens in tiers, not all at once. All artifacts go to the filesystem.
```
┌─────────────────────────────────────────────────────────────────┐
│                      TIERED RESEARCH MODEL                      │
└─────────────────────────────────────────────────────────────────┘

TIER 0: SCOUT                  TIER 1: FOCUSED RESEARCH
─────────────────              ────────────────────────
┌─────────────┐                ┌─────────────┐
│ Initial     │                │ Sub-question│──▶ Cited answer
│ Question    │                │ (fact-find) │    (short)
└──────┬──────┘                └─────────────┘
       │                       ┌─────────────┐
       ▼                       │ Sub-question│──▶ Cited answer
┌─────────────┐   Decompose    │ (fact-find) │    (short)
│ Scout       │───────────────▶└─────────────┘
│ Search      │                ┌─────────────┐
└──────┬──────┘                │ Sub-question│──▶ RECURSE
       │                       │ (research)  │    (own scout/decompose)
       ▼                       └─────────────┘
┌─────────────┐                       │
│ Write to:   │                       ▼
│ planning/   │                ┌─────────────┐
│ <slug>/     │                │ Assemble    │──▶ Draft document
│ 00-scout.md │                │ from files  │
└─────────────┘                └──────┬──────┘
                                      │
                                      ▼
                               ┌─────────────┐
                               │ Verify      │──▶ Final document
                               │ citations   │
                               └─────────────┘
```
Why this works: each tier leaves a durable artifact on disk, each sub-question stays small enough to answer well, and the final document is assembled from verified pieces rather than from memory.
All research artifacts go to `$FOREST_ROOT/planning/<question-slug>/`:

```
$FOREST_ROOT/planning/
└── how-twoliter-builds-kits/
    ├── 00-scout.md           # Scout findings + sub-questions
    ├── 01-kit-structure.md   # Sub-question answer (fact-find)
    ├── 02-build-command.md   # Sub-question answer (fact-find)
    ├── 03-buildsys/          # Sub-question that needed recursion
    │   ├── 00-scout.md
    │   ├── 01-spec-parsing.md
    │   └── 02-docker-build.md
    └── FINAL.md              # Assembled document
```

Slug format: lowercase, hyphens, descriptive (e.g., `how-twoliter-builds-kits`)
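The slug can be derived mechanically from the question. A minimal sketch (any equivalent slugifier works):

```shell
# Lowercase, collapse runs of non-alphanumerics to a hyphen, trim the edges
echo "How does Twoliter build kits?" \
  | tr '[:upper:]' '[:lower:]' \
  | tr -cs 'a-z0-9' '-' \
  | sed 's/^-*//; s/-*$//'
# → how-does-twoliter-build-kits
```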
If you have todolist functionality, use it to enforce workflow completion.
Before each phase, enumerate the work in your plan, one todo item per artifact or citation.
The pattern: Enumerate → Execute → Confirm count matches.
This prevents "doing some" instead of "doing all." The plan is a contract: once you've written "Verify citation [7]", you're committed to doing it.
Execute tiers sequentially, using the filesystem as your "memory":

1. `00-scout.md`
2. `01-*.md`, `02-*.md`, etc.
3. `FINAL.md`

This allows context-window resets between phases if needed.
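Because each phase's output is a file, a resumed run can check what already exists before doing any work. A sketch, assuming the workspace layout above:

```shell
# Report which phase artifacts exist, so a fresh context skips finished tiers
ws="$FOREST_ROOT/planning/how-twoliter-builds-kits"
for artifact in 00-scout.md FINAL.md; do
  if [ -f "$ws/$artifact" ]; then
    echo "$artifact: done"
  else
    echo "$artifact: pending"
  fi
done
```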
Goal: Understand what you're dealing with. Write findings to 00-scout.md.
```shell
mkdir -p $FOREST_ROOT/planning/<question-slug>

# Broad search to find relevant areas
(cd $FOREST_ROOT && sembly search "system-name overview")
(cd $FOREST_ROOT && sembly search "system-name architecture")
```
Read 2-3 top results. Capture in 00-scout.md:
```markdown
# Scout: <Original Question>

## Key Concepts Discovered
- [Concept 1]: [Brief description]
- [Concept 2]: [Brief description]

## Relevant Files Found
- `path/to/file.md` - [What it covers]
- `path/to/code.rs` - [What it covers]

## Terminology
- [Term]: [Definition as used in this codebase]

## Sub-Questions

### 1. [Sub-question text]
- **Type:** fact-find | research-document
- **Why:** [Why this classification]
- **Key files:** [Files likely to answer this]

### 2. [Sub-question text]
...
```
Sub-question classification:
| If the sub-question... | Type | Action |
|---|---|---|
| Has a concrete, specific answer | fact-find | Answer in 1-2 paragraphs |
| Asks "what is X" or "where is Y" | fact-find | Answer in 1-2 paragraphs |
| Asks "how does X work" | research-document | Recurse (own scout/decompose) |
| Involves multiple components interacting | research-document | Recurse |
| Would need 3+ source files to answer | research-document | Recurse |
Recursion check: If more than 2 sub-questions are type research-document, consider whether the original question is too broad.
Depth limit: Maximum recursion depth is 2 levels. If a sub-sub-question would need its own recursion, either answer it at a coarser level as a fact-find, or narrow the parent question's scope and note the remainder as an open question.
For each sub-question, write to NN-<slug>.md:
For fact-find sub-questions:
```markdown
# <Sub-Question>

<Direct answer with inline citations>

The kit directory must contain a `Twoliter.toml` file <sup>[1]</sup> and a `Cargo.toml`
that lists packages as dependencies <sup>[2]</sup>.

## Sources

<sup>[1]</sup> [`twoliter/README.md`](../twoliter/README.md) - Kit requirements section
<sup>[2]</sup> [`kits/bottlerocket-core-kit/Cargo.toml`](../kits/bottlerocket-core-kit/Cargo.toml) - Example kit manifest
```
For research-document sub-questions:
Create a subdirectory and recurse:

```
$FOREST_ROOT/planning/how-twoliter-builds-kits/03-buildsys/
├── 00-scout.md
├── 01-....md
└── FINAL.md
```
The sub-question's FINAL.md becomes the answer.
If you can't answer from sources: record the gap explicitly in the answer file rather than guessing.
Read all sub-question answers from the workspace. Combine into FINAL.md:
```markdown
# [System Name]

**Keywords:** keyword1, keyword2, keyword3

## Overview
[Visual summary - diagram or table showing the whole system]
[1-2 sentences of context]

## How [Underlying Model] Works
[Synthesize from sub-question answers about fundamentals]

## [Main Topic]
[Synthesize from sub-question answers about the core process]

### [Subtopic]
**Goal:** [One sentence]
[Content with citations carried forward from sub-questions]

## Appendix: [Detailed Reference]
[Dense details, tables, configuration options]

## Conflicts & Resolutions
[If sources disagreed, document how you resolved it]
- [Topic]: Source A said X, Source B said Y. Resolution: [reasoning]
(Write "None" if no conflicts found)

## Sources
[Consolidated numbered citations from all sub-questions]
```
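Before synthesizing, re-read every answer in order, including recursed sub-directories. A sketch using the naming convention above (synthesize from the content; don't just concatenate it):

```shell
# Dump each sub-question answer with a visible separator for review
cd "$FOREST_ROOT/planning/how-twoliter-builds-kits"
for f in [0-9][0-9]-*.md [0-9][0-9]-*/FINAL.md; do
  [ -e "$f" ] || continue            # skip globs that matched nothing
  printf '\n===== %s =====\n' "$f"
  cat "$f"
done
```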
Overview section: lead with the visual summary (diagram or table), then 1-2 sentences of context.
Tables vs prose: use tables for dense reference material; use prose where reasoning needs explaining.
Citations: `<sup>[1]</sup>` inline with facts.

Long documents (output token limits): Claude has output limits (~4K-8K tokens per response). For documents longer than ~10 pages, draft FINAL.md incrementally, appending one section per response.
Goal: Confirm citations support their claims. Catch unsupported statements.
Before spawning any verifiers, you MUST enumerate all citations in your plan/todolist:
Do NOT proceed to spawning until your plan has N+1 items (N citations + confirmation).
This prevents the failure mode of "verifying a few representative citations" instead of all of them.
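The count itself can also be checked mechanically. A sketch, where `verification.log` is a hypothetical file collecting one verdict line per verifier:

```shell
# Distinct citation markers in the draft vs. verdicts actually recorded
expected=$(grep -o '<sup>\[[0-9]*\]</sup>' FINAL.md | sort -u | wc -l)
verified=$(grep -c '^VERDICT' verification.log)
if [ "$expected" -ne "$verified" ]; then
  echo "mismatch: $expected citations, $verified verified"
fi
```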
For Multi-Agent Systems:
Spawn lightweight verifiers with pre-loaded context. Each verifier needs NO tools - just reasoning.
Citation Verifier (spawn one per citation):
Prompt:
```
You are a skeptical reviewer. Your job is to find problems, not confirm correctness.

CLAIM: "The build process starts by loading Twoliter.toml"
CITED SOURCE: twoliter/twoliter/src/project/mod.rs

SOURCE CONTENT:
[paste relevant lines from the file]

Does the source content ACTUALLY support this specific claim?

Reply with one of:
- SUPPORTED: [brief explanation of what in the source confirms this]
- UNSUPPORTED: [what's missing or wrong]
- PARTIAL: [what's confirmed vs what's not]

Be skeptical. If the source is tangentially related but doesn't directly
state the claim, that's PARTIAL or UNSUPPORTED.
```
Uncited Claim Detector (spawn once per document section):
Prompt:
```
You are a skeptical reviewer. Your job is to find problems.

SECTION:
[paste section text]

Find factual claims that lack a <sup>[N]</sup> citation.
Factual claims include: file paths, behavior descriptions,
configuration values, component names, process steps.
Opinions, transitions, and summaries don't need citations.

List each uncited claim you find. Do NOT say "looks good" or "none found"
unless you've checked every sentence. If you find nothing, explain what
you checked to confirm there are no uncited claims.
```
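A cheap mechanical pre-scan can flag the obvious cases before spawning detectors. This is a heuristic only; it catches backticked file paths without a citation marker, not uncited prose claims:

```shell
# Lines mentioning a backticked file path but carrying no citation marker
grep -n '`[^`]*/[^`]*`' FINAL.md | grep -v '<sup>\['
```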
Batch these: Use spawn_batch to run all citation verifiers in parallel. The batch size MUST equal the citation count from Step 1.
After verifiers complete:
If counts don't match, you skipped citations. Go back and verify the missing ones.
Fix any UNSUPPORTED or PARTIAL before finalizing.
Self-check is less reliable but still valuable. Use the same adversarial framing as the verifier prompts: hunt for problems rather than confirming correctness.
Note: Agents checking their own work tend to confirm it. The filesystem-based workflow helps - you can reset context and review with fresh eyes.
Before finalizing:
End your document with: