| name | wiki-aggregate |
| description | Use when you have N≥3 raw research artifacts (notes, podcast summaries, deep-research dumps, daily intel, paper analyses) on one topic and want to lift them into a single structured pack with cross-source claims and provenance — instead of one-shot summarization that loses 90% of intermediate evidence. Treats the N sources as an environment a lite aggregator agent navigates with `inspect` / `search` / `synthesize` tools, rather than concatenating into one prompt. |
Wiki Aggregate — N raw sources → 1 structured pack
A protocol for agentic aggregation of long-horizon research material. Inverts the standard "concat all → ask LLM to summarize" pipeline: instead, an aggregator agent navigates the N source files with three tools, building a notes scratchpad with full path:line provenance, and finally writes a structured pack (brief / findings / sources / aggregation log).
Core principle: don't read everything upfront. Don't merge final answers. Treat the N sources as a queryable environment.
Why this exists
Three traditional ways to aggregate N parallel research outputs all fail on long-horizon, open-ended tasks:
❌ concat all sources into one prompt
→ 200K+ token explosion, attention collapse on long context
❌ summarize each, then merge summaries
→ ~90% of intermediate evidence (the "I noticed X but..." asides) is lost
❌ LLM-as-judge picks the single best source
→ discards the other N-1 sources' independent findings
These failure modes show up clearly on open-ended research tasks where there's no ground-truth verifier. The alternative: treat the N sources as an environment, send a lite agent in to inspect / search / synthesize on demand. Cost ≈ a single rollout, recall is materially higher, and cross-source contradictions get surfaced explicitly.
This skill is the protocol. No Python, no MCP — pure markdown protocol that any harness with Read + Grep can execute.
When to Use
- You have 3-30 raw notes / transcripts / research dumps on one topic, scattered across files
- The current pack on this topic feels thin or you suspect raw evidence wasn't fully lifted
- You want claims with path:line provenance, not vibes
- You want cross-source contradictions surfaced, not buried
When NOT to Use
- N < 3 sources — protocol overhead exceeds value, just write a summary by hand
- N > 30 — budget will be exhausted; narrow the scope or batch by sub-topic
- Sources are not yet collected — this is a synthesis tool, not a research tool. Run your collectors first.
- You only need a one-shot Q&A; use a reactive wiki-ask-style skill instead
The Aggregation Loop
Trajectories-as-environment
╔════════════════════════════════════════╗
║                                        ║
║   src_1    src_2    src_3  ...  src_N  ║
║   [..]     [..]     [..]        [..]   ║
║   [..]     [..]     [..]        [..]   ║
║                                        ║
╚═══════════════════╤════════════════════╝
                    │
                    │  not concatenated.
                    │  not summarized.
                    │  navigated.
                    │
                    ▼
┌────────────────────────────────────────┐
│  AGGREGATOR (lite agent)               │
│  ┌──────────────────────────────────┐  │
│  │ inspect_file / inspect_section   │  │
│  │ search_sources                   │  │
│  │ cross_pack_check                 │  │
│  └──────────────────────────────────┘  │
│                                        │
│  scratch state:                        │
│    notes = []      # {claim, evidence, │
│                    #  source, line_ref}│
│    budget = 25     # tool calls        │
│    subtopics = derived from skim pass  │
│                                        │
│  loop until: subtopic coverage met,    │
│           OR budget = 0,               │
│           OR 2 zero-info calls         │
└───────────────────┬────────────────────┘
                    │
                    ▼
     ┌─────────────────────────────┐
     │  pack/                      │
     │    brief.md                 │
     │    findings.md   ← claims   │
     │    sources.tsv   ← S-IDs    │
     │    _aggregation_log.md      │
     └─────────────────────────────┘
Process
Phase 1: Scope
- Resolve the source glob. Count N.
- Hard stop if N < 3 — abort and tell the user to either collect more or summarize manually.
- Warn if N > 30 — suggest narrowing by sub-topic or batching across runs.
- If updating an existing pack, locate it and read its current brief.md + findings.md so you know what already exists.
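The protocol itself needs no code, but the Phase 1 gate is simple enough to sketch. The following is a minimal illustration in Python; the names `scope_decision` and `resolve_sources` are hypothetical and not part of the skill, and only the thresholds 3 and 30 come from the text above.

```python
from glob import glob

MIN_SOURCES = 3   # hard stop below this count (from the protocol)
MAX_SOURCES = 30  # warn above this count (from the protocol)


def resolve_sources(pattern: str) -> list:
    """Expand the source glob into a sorted, de-duplicated list of paths."""
    return sorted(set(glob(pattern, recursive=True)))


def scope_decision(n: int) -> str:
    """Map a source count to 'abort', 'warn', or 'ok' (hypothetical labels)."""
    if n < MIN_SOURCES:
        return "abort"  # tell the user to collect more or summarize by hand
    if n > MAX_SOURCES:
        return "warn"   # suggest narrowing by sub-topic or batching
    return "ok"
```

A harness could call `scope_decision(len(resolve_sources(pattern)))` before doing any reads, so an out-of-range N never consumes budget.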
Phase 2: Cheap-Pass (Skim)
For each source, do one cheap read:
- File ≤ 200 LOC: read the whole file.
- File > 200 LOC: read first 80 lines (frontmatter + intro + section headers).
Build an in-memory source map:
S1 | path/to/source_1.md | what it covers (1-2 lines) | rough_topics
S2 | path/to/source_2.md | ... | ...
This pass costs ~N reads, each bounded. Do not skip — the source map is what makes Phase 3 efficient.
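As a rough sketch of the skim rule, assuming each source has already been loaded as a list of lines: `skim` and `build_source_map` are illustrative names, and the empty `summary` and `rough_topics` fields stand in for what the agent would fill in after reading.

```python
SKIM_THRESHOLD = 200  # files at or below this many lines are read whole
SKIM_HEAD = 80        # longer files: read only the first 80 lines


def skim(lines: list) -> list:
    """Return the bounded portion of a source that the cheap pass reads."""
    if len(lines) <= SKIM_THRESHOLD:
        return lines
    return lines[:SKIM_HEAD]


def build_source_map(sources: dict) -> list:
    """Assign S-IDs in input order and attach the skimmed lines per source."""
    source_map = []
    for i, (path, lines) in enumerate(sources.items(), start=1):
        source_map.append({
            "id": f"S{i}",
            "path": path,
            "skimmed": skim(lines),
            "summary": "",        # filled in by the agent (1-2 lines)
            "rough_topics": [],   # filled in by the agent
        })
    return source_map
```

Because each read is capped at 200 lines, the whole pass stays bounded at roughly N small reads regardless of how long individual sources are.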
Phase 3: Aggregator Loop (budget = 25 by default)
Tool inventory (use whatever your harness provides — Read, Grep are sufficient):
| Verb | Implementation | When to use |
|---|---|---|
| inspect_file(path) | Read whole file | Source < 200 LOC and you need full content |
| inspect_section(path, line_range) | Read with offset + limit | Drilling into a specific span of a long source |
| search_sources(pattern) | Grep over the N source paths only | Finding a keyword / theme across sources |
| cross_pack_check(pattern) | Grep over your wider knowledge base, excluding the target pack and the raw sources | Avoiding duplicate claims with existing packs |
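In a real harness these verbs map onto Read and Grep. For readers who want to see the semantics spelled out, here is a small in-memory approximation, assuming sources are held as a dict of path to list of lines. The signatures are illustrative and differ slightly from the table (lines are passed in directly rather than a path).

```python
import re


def inspect_section(lines: list, start: int, end: int) -> list:
    """Return (line_no, text) pairs for lines start..end, 1-indexed inclusive.

    Assumes start >= 1; an end past the file is clamped to the last line.
    """
    last = min(end, len(lines))
    return [(n, lines[n - 1]) for n in range(start, last + 1)]


def search_sources(sources: dict, pattern: str) -> list:
    """Grep-like search over the N sources only; returns (path, line_no, text)."""
    rx = re.compile(pattern)
    hits = []
    for path, lines in sources.items():
        for n, text in enumerate(lines, start=1):
            if rx.search(text):
                hits.append((path, n, text))
    return hits
```

Returning line numbers with every hit is the point: each result can be turned directly into a path:line reference for a note.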
Loop discipline:
state.notes = []
state.budget = 25 (or user-specified)
while state.budget > 0:
    pick highest-value next action:
        drill: a subtopic has a hot lead in one source
        cross_search: a claim from S1 should be cross-checked against others
        dedup_check: a claim looks novel; verify no existing pack covers it
        resolve: two sources disagree; inspect both passages
        explore: a subtopic has zero notes after Phase 2; broaden search
        DONE: coverage threshold met
    if action is DONE: exit the loop
    record note → {claim, evidence_quote, source_id, line_ref, confidence}
    state.budget -= 1
Stopping criteria — declare DONE when ANY holds:
- ≥3 high-confidence notes per subtopic
- Budget exhausted
- Two consecutive tool calls returned no new info
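One way to make the three criteria checkable after every tool call is a single function that returns which criterion fired, which is also what the aggregation log records. This is a sketch under assumptions: `stop_reason` and its argument names are hypothetical, and the per-subtopic counts are assumed to be tracked by the agent.

```python
from typing import Optional


def stop_reason(high_conf_per_subtopic: dict, budget: int,
                zero_info_streak: int) -> Optional[str]:
    """Return the stopping criterion that holds, or None to keep looping.

    high_conf_per_subtopic: subtopic -> count of high-confidence notes
    budget: remaining tool calls
    zero_info_streak: consecutive tool calls that returned nothing new
    """
    # Coverage: at least 3 high-confidence notes in every known subtopic.
    if high_conf_per_subtopic and all(
            count >= 3 for count in high_conf_per_subtopic.values()):
        return "coverage"
    if budget <= 0:
        return "budget"
    if zero_info_streak >= 2:
        return "zero-info"
    return None
```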
Hard rule: every note MUST have a source_id + line_ref (path + line range).
No provenance, no claim. This is what makes the pack auditable.
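If notes were kept in code rather than in the agent's scratchpad, the hard rule could be enforced at construction time. The sketch below assumes the note fields from the loop description; the `Note` class and the error messages are illustrative only, and the allowed confidence values are taken from the findings template.

```python
from dataclasses import dataclass

ALLOWED_CONFIDENCE = {"high", "medium", "low"}


@dataclass(frozen=True)
class Note:
    claim: str
    evidence_quote: str
    source_id: str   # e.g. "S1"
    line_ref: str    # e.g. "path/to/source.md:L120-128"
    confidence: str

    def __post_init__(self):
        # No provenance, no claim: reject notes missing either half.
        if not self.source_id.strip() or not self.line_ref.strip():
            raise ValueError("note rejected: missing source_id or line_ref")
        if self.confidence not in ALLOWED_CONFIDENCE:
            raise ValueError(f"unknown confidence: {self.confidence!r}")
```

Rejecting the note when it is recorded, rather than when findings.md is written, means a missing line reference is caught while the passage is still cheap to re-read.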
Phase 4: Write the Pack
Output location: <pack-name>/. If updating, merge with existing files (preserve original sources for existing claims, add new claims, flag superseded ones).
Files:
- brief.md: 200-400 word executive overview. Subtopic skeleton. Reading order suggestion.
- findings.md: claims, one block per finding, grouped under subtopic headers:
## Claim: <one-line claim>
Status: supported | contradicted | uncertain
Confidence: high | medium | low
Sources: S1, S3, S7
Evidence:
- "exact quote or paraphrase" — S1 (path/to/source.md:L120-128)
- "..." — S3 (path/to/other.md:L45-50)
Notes: <optional — e.g., "S3 contradicts S7 on date">
- sources.tsv (S-ID mapping):
id path type captured_at url_or_origin
S1 path/to/source_1.md podcast-notes 2026-04-12 https://...
S2 path/to/source_2.md daily-intel 2026-04-13 ...
- _aggregation_log.md: always written; serves as the audit trail:
# Aggregation Log
Date: YYYY-MM-DD
Topic: <topic>
Sources: N=<N>
Tool calls: X / budget Y
Cross-pack overlaps: <list or "none">
Subtopics covered: <list>
Skipped sources (no relevant content): <list>
Stopping criterion triggered: <which one>
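The findings.md claim block is regular enough to render mechanically from recorded notes. The following is a hedged sketch, not a required part of the protocol: `render_claim` is a hypothetical helper, and evidence is assumed to arrive as (quote, source_id, line_ref) tuples.

```python
def render_claim(claim: str, status: str, confidence: str,
                 evidence: list, notes: str = "") -> str:
    """Render one findings.md block from (quote, source_id, line_ref) tuples."""
    # Collect source IDs in first-seen order, without duplicates.
    source_ids = []
    for _, source_id, _ in evidence:
        if source_id not in source_ids:
            source_ids.append(source_id)

    out = [
        f"## Claim: {claim}",
        f"Status: {status}",
        f"Confidence: {confidence}",
        "Sources: " + ", ".join(source_ids),
        "Evidence:",
    ]
    for quote, source_id, line_ref in evidence:
        # \u2014 is the dash separator used in the template's evidence lines.
        out.append(f'- "{quote}" \u2014 {source_id} ({line_ref})')
    if notes:
        out.append(f"Notes: {notes}")
    return "\n".join(out)
```

Deriving the Sources line from the evidence list keeps the two from drifting apart when a claim is later merged or updated.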
Phase 5: Index hint
Append a one-line entry to your pack index (do not trigger a full reindex — that's a different skill's job).
Phase 6: Report
Print:
Pack written: <pack-name>/
Sources processed: N
Aggregator tool calls: X / budget Y
Subtopics: K
Claims extracted: M (high: a, medium: b, low: c)
Cross-pack overlaps: <list or "none">
Sources with low yield: <list>
Suggested next: <reindex command> && <lint command>
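The per-confidence breakdown in the report is a simple tally. As a small illustration, assuming the confidence label of each extracted claim is available as a list of strings (`claims_line` is a hypothetical name):

```python
from collections import Counter


def claims_line(confidences: list) -> str:
    """Format the 'Claims extracted' report line from per-claim confidences."""
    counts = Counter(confidences)  # missing labels count as 0
    return (
        f"Claims extracted: {len(confidences)} "
        f"(high: {counts['high']}, medium: {counts['medium']}, "
        f"low: {counts['low']})"
    )
```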
Anti-rationalizations
| Excuse the agent will invent | Rebuttal |
|---|---|
| "I'll just read all N files in Phase 2 to be safe" | That's the V1 mistake this skill exists to fix. Long-context attention degrades; you'll lose information you "read." Stay disciplined: cheap-pass first, drill on demand. |
| "Skipping cross_pack_check, it's a small repo" | Repos grow. Duplicate claims accumulate silently. One Grep per novel claim costs almost nothing. |
| "I have a great quote but I don't remember the line number" | Then the note is invalid. Re-Read to get path:L<lines>. No provenance, no claim: refuse to write findings.md if any note is missing provenance. |
| "Only 2 sources matched the glob, so I'll proceed anyway" | No. Hard stop at N < 3. Either collect more or write a summary by hand. The protocol overhead is wasted on small N. |
| "All sources got 'low yield', so I'll write findings from my prior knowledge" | No. The pack is supposed to reflect what's in the sources. If yield is low, the brief stays empty and the log says so. Don't fabricate. |
| "I'll skip writing _aggregation_log.md, it's just paperwork" | No. The log is what makes the next run reproducible. It's also the audit trail when someone questions a claim months later. |
Verification
A successfully completed run produces:
If any checkbox fails, the run is incomplete — do not declare DONE.
Related skills in this repo
debug-hypothesis — same disciplined-loop pattern, applied to bug investigation rather than research synthesis
spec-driven-dev — same explicit-exit-criteria philosophy, applied to building software end-to-end