| name | megabrain |
| description | Feature planning and implementation with 4 adaptive phases — Specify, Design, Tasks, Execute. Auto-sizes depth by complexity. Creates atomic tasks with verification criteria, atomic git commits, and requirement traceability. Features an independent Verifier (author != verifier, evidence-or-zero), persistent decision log (STATE.md), and test-coverage-matrix-driven tests, plus a self-improving lessons layer that turns verification failures into reusable project-local guidance. Stack-agnostic. Use when (1) Planning features (requirements, design, task breakdown), (2) Implementing with verification and atomic commits, (3) Validating or verifying an implementation against a spec. Triggers on "specify feature", "discuss feature", "design", "tasks", "implement", "validate", "verify work", "UAT", "record decision", "pause work", "resume work". Do NOT use for architecture decomposition analysis (use architecture skills) or technical design docs (use create-technical-design-doc). |
| license | CC-BY-4.0 |
| metadata | {"author":"Felipe Rodrigues - github.com/felipfr","version":"3.1.0"} |
Tech Lead's Club - Spec-Driven Development
Plan and implement features with precision. Granular tasks. Clear dependencies. Right tools. Zero ceremony.
┌──────────┐ ┌──────────┐ ┌─────────┐ ┌─────────┐
│ SPECIFY │ → │ DESIGN │ → │ TASKS │ → │ EXECUTE │
└──────────┘ └──────────┘ └─────────┘ └─────────┘
required optional* optional* required
* Agent auto-skips when scope doesn't need it
Critical Rules (read before acting)
Loading this skill's files. Reference files live under references/ in this skill's own directory (where this SKILL.md resides). Resolve them relative to the skill directory — never the workspace root — and load them through the active skill by name; never assume a fixed install path. When a step tells you to read a reference, read it completely (to EOF) before acting — never act on a partial/truncated read.
Execution contract — every task, non-negotiable (holds even if you do not open the reference files):
- Tests derive from the spec's acceptance criteria and assert spec-defined outcomes — they never mirror the implementation.
- The gate must pass (tests pass) before a task is done — the test runner decides, not self-assessment.
- One atomic commit per task. Never batch tasks; never weaken, skip, or delete tests to make them pass.
- After the LAST task, a fresh Verifier always runs automatically (author ≠ verifier) — spec-anchored outcome check + discrimination sensor. It is never optional and never prompted. See Sub-Agent Delegation.
Before Execute: read implement.md completely; if a formal tasks.md has more than 3 phases, present the sub-agent offer first (see Sub-Agent Delegation).
Auto-Sizing: The Core Principle
The complexity determines the depth, not a fixed pipeline. Before starting any feature, assess its scope and apply only what's needed:
| Scope | What | Specify | Design | Tasks | Execute |
|---|
| Small | ≤3 files, one sentence | One-liner spec (inline) | Skip | Skip | Implement + verify inline |
| Medium | Clear feature, <10 tasks | Spec (brief) | Skip — design inline | Skip — tasks implicit | Implement + verify |
| Large | Multi-component feature | Full spec + requirement IDs | Architecture + components | Full breakdown + dependencies | Implement + verify per task |
| Complex | Ambiguity, new domain | Full spec + discuss gray areas | Research + architecture | Breakdown + parallel plan | Implement + interactive UAT |
Rules:
- Specify and Execute are always required — you always need to know WHAT and DO it
- Design is skipped when the change is straightforward (no architectural decisions, no new patterns)
- Tasks is skipped when there are ≤3 obvious steps (they become implicit in Execute)
- Discuss is triggered within Specify when the agent detects ambiguous gray areas that need user input, or when the feature has any implicit-requirement dimension present (persistence/state, external calls, auth, payments, concurrency, state transitions)
- Interactive UAT is triggered within Execute only for user-facing features with complex behavior
Safety valve: Even when Tasks is skipped, Execute ALWAYS starts by listing atomic steps inline (see implement.md). If that listing reveals >5 steps or complex dependencies, STOP and create a formal tasks.md — the Tasks phase was wrongly skipped.
.specs Structure
.specs/
├── STATE.md # Project memory: Decisions log (AD-NNN) + Handoff snapshot
├── LESSONS.md # Self-improving lessons playbook (rendered by scripts/lessons.py — do not hand-edit)
├── lessons.json # Canonical lessons state (machine-owned)
└── features/ # Feature specifications
└── [feature]/
├── spec.md # Requirements with traceable IDs
├── context.md # User decisions for gray areas (only when discuss is triggered)
├── design.md # Architecture & components (only for Large/Complex)
├── tasks.md # Atomic tasks with verification (only for Large/Complex)
└── validation.md # Verifier report: PASS/FAIL, per-AC evidence, sensor result, diff range
Workflow
New feature:
- Specify → (Design) → (Tasks) → Execute (depth auto-sized)
Resume work:
Read .specs/STATE.md — Handoff section for in-flight state, Decisions section to re-confirm active constraints — then propose the next step.
Context Loading Strategy
On-demand load (only what the current task needs):
.specs/STATE.md — Decisions section (read at Design, re-read on resume); Handoff section (read on resume only)
- confirmed lessons — load at Specify and Design via
python3 scripts/lessons.py list --status confirmed (lessons.md); confirmed only, never candidates
- spec.md (when working on a specific feature)
- context.md (when designing or implementing from user decisions)
- design.md (when implementing from design)
- tasks.md (when executing tasks)
Never load simultaneously:
- Multiple feature specs
- Multiple architecture docs
Target: <40k tokens total context
Reserve: 160k+ tokens for work, reasoning, outputs
Monitoring: Display status when >40k (see context-limits.md)
Sub-Agent Delegation
Trigger: >3 phases → offer one worker per phase (sequential); ≤3 phases → execute inline.
Offer-then-confirm — never auto-spawn. The user must accept before any sub-agent is dispatched.
One worker per phase: Each phase worker executes all its tasks in order (implement → gate → atomic commit), then reports a compact summary (tasks done, commit hashes, test counts, deviations). Workers never spawn further sub-agents.
Verifier (always-on, never prompted): After the final task is committed, the orchestrator dispatches a fresh Verifier sub-agent automatically — regardless of phase count. Validation never requires a user prompt; it is the closing step of Execute. Author ≠ verifier: the Verifier re-derives coverage independently using evidence-or-zero; it does not inherit the author's mental model. The Verifier: (1) performs a spec-anchored outcome check — confirms each test's asserted value matches the spec-defined expected outcome, flags spec-precision gaps; (2) runs a discrimination sensor — injects behavior-level faults in scratch state, confirms tests kill them, discards mutations, surviving mutants become fix tasks; (3) writes .specs/features/[feature]/validation.md (PASS/FAIL, per-AC evidence, sensor result, diff range); (4) returns a compact verdict + ranked gap list to the orchestrator in chat. Gaps become fix tasks; the fix→re-verify loop is bounded to 3 iterations before escalating. (5) distills lessons — turns each grounded failure (surviving mutant, spec-precision gap, failed AC, SPEC_DEVIATION) into a reusable project-local lesson via scripts/lessons.py; a clean PASS records nothing (see lessons.md).
Standalone fallback: Without sub-agents, run validate.md as an independent fresh-eyes pass after the final commit — including the spec-anchored check and discrimination sensor.
Full mechanics (worker payload, compact summary format, failure handling, context sizing, Verifier report format): sub-agents.md.
Commands
Feature-level (auto-sized):
| Trigger Pattern | Reference |
|---|
| Specify feature, define requirements | specify.md |
| Discuss feature, capture context, how should this work | discuss.md |
| Design feature, architecture | design.md |
| Break into tasks, create tasks | tasks.md |
| Implement task, build, execute | implement.md |
| Validate, verify, test, UAT, walk me through it | validate.md |
Memory:
| Trigger Pattern | Reference |
|---|
| Record decision, this is a project-level decision | memory.md |
| Pause work, end session, I need to stop | memory.md |
| Resume work, continue, pick up where we left off | memory.md |
| Load lessons, what have we learned, apply past lessons | lessons.md |
| Record lesson, distill lessons (auto-runs after validation) | lessons.md |
Knowledge Verification Chain
When researching, designing, or making any technical decision, follow this chain in strict order. Never skip steps.
Step 1: Codebase → check existing code, conventions, and patterns already in use
Step 2: Project docs → README, docs/, inline comments, `.specs/STATE.md` (Decisions)
Step 3: Context7 MCP → resolve library ID, then query for current API/patterns
Step 4: Web search → official docs, reputable sources, community patterns
Step 5: Flag as uncertain → "I'm not certain about X — here's my reasoning, but verify"
Rules:
- Never skip to Step 5 if Steps 1-4 are available
- Step 5 is ALWAYS flagged as uncertain — never presented as fact
- NEVER assume or fabricate. If you cannot find an answer, say "I don't know" or "I couldn't find documentation for this". Inventing APIs, patterns, or behaviors causes cascading failures across design → tasks → implementation. Uncertainty is always preferable to fabrication.
Output Behavior
Model guidance: After completing lightweight tasks (validation, feature-level checks), naturally mention once per session that such tasks work well with faster/cheaper models. For heavy tasks (complex design, large features), briefly note the reasoning requirements before starting.
Be conversational, not robotic. Don't interrupt workflow—add as a natural closing note. Skip if user seems experienced or has already acknowledged the tip.
Code Analysis
Use available tools with graceful degradation. See code-analysis.md.