retrospective

Structured retrospective after completing a delivery increment or diamond. Captures learning for continuous improvement.

Install command

npx skills add https://github.com/haabe/mycelium --skill retrospective

Copy and paste this command into Claude Code to install the skill

Source

haabe/mycelium

Stars: 30

Forks: 3

Updated: June 2, 2026 at 09:05

SKILL.md

Retrospective

Run after every completed delivery diamond or significant milestone. Source: Forsgren (learning culture).

Preflight: Read target canvas file(s) before any Write/Edit

Hard rule. Before issuing Write or Edit against any .claude/canvas/*.yml, use the Read tool on that file in this session. Claude Code's Read-before-Write check requires the Read tool specifically — cat/head/grep via Bash do NOT satisfy it.

Edit vs Write — different cost profiles (verified 2026-05-14):

Edit (exact-string replacement): Read with limit: 1 satisfies the check at ~50 tokens. State-tracking is per-file, not per-byte — subsequent Edit calls work anywhere in the file. Use this for partial updates against large canvas files (e.g., purpose.yml at 800+ lines).
Write (full replacement): do a full Read first. Write obliterates the file; you should see what you're about to replace. The limit:1 shortcut is not appropriate here.

ID-bearing entries — scan the ID space before assigning (added 2026-05-15, v0.23.19): When adding a new component, opportunity, solution, or any other ID-bearing entry to a canvas file, run a Bash grep first to confirm the next ID in your prefix sequence is actually free:

grep "^  - id: <prefix>-" .claude/canvas/<file>.yml | sort -u

Replace <prefix> with the canvas's ID prefix (comp for landscape, opp for opportunities, sol for solutions, ht for human-tasks, etc.). Then pick the next free integer. validate_canvas.py has a duplicate-ID check (lines 230-239) that catches the failure on CI, but a duplicate can persist in the working tree for days if CI isn't run between edit and discovery — see roadmap-repo corrections.md 2026-05-15 "Duplicate canvas ID created in landscape.yml" for the worked example.
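The same scan can be sketched in Python for cases where the next ID should be computed rather than read off the grep output. This is a minimal illustration, not part of the skill: the function name `next_free_id` is hypothetical, the three-digit zero padding is an assumption based on the `opp-XXX` placeholders, and "next free" is interpreted as one past the highest number in use.

```python
import re


def next_free_id(canvas_text: str, prefix: str) -> str:
    """Return the next unused '<prefix>-NNN' ID in the given canvas text.

    Assumptions (not confirmed by the skill): IDs appear on lines shaped
    like '  - id: comp-007', and new IDs are zero-padded to three digits.
    """
    pattern = re.compile(
        r"^\s*-\s*id:\s*[\"']?" + re.escape(prefix) + r"-(\d+)\b",
        re.MULTILINE,
    )
    used = {int(match.group(1)) for match in pattern.finditer(canvas_text)}
    # One past the highest number seen; gaps are deliberately not reused.
    return f"{prefix}-{max(used, default=0) + 1:03d}"


sample = """\
components:
  - id: comp-001
    name: api
  - id: comp-002
    name: db
  - id: comp-004
    name: cache
"""

print(next_free_id(sample, "comp"))  # comp-005
```

Gaps (here `comp-003`) are left unused on purpose, on the assumption that a retired ID may still be referenced elsewhere.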

Original failure mode: anti-pattern #7 instance #5, 2026-05-09 — agent conflated Bash head with the Read tool, lost ~14k tokens to a Write-fail → remedial-full-Read → re-Write loop. The limit:1 discipline (graduated 2026-05-14, v0.23.18) prevents the second-order cost where the agent correctly follows the rule but full-Reads every time. The ID-scan discipline (graduated 2026-05-15, v0.23.19) prevents the related class where the agent reads enough of the file to satisfy the Edit check but not enough to see existing ID assignments — kin to anti-pattern #8 (Stale State Read).

If this skill writes to multiple canvas files, register each one first (limit:1 for Edit-only paths; full Read for Write paths) AND ID-scan any prefix you intend to assign.
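The registration rule for multiple files can be pictured as a small planning step. The sketch below is illustrative only: `preflight_plan` and the dictionary shape are hypothetical, and the `file_path` and `limit` keys simply mirror the way the Read calls are described above.

```python
def preflight_plan(targets: dict[str, str]) -> list[dict]:
    """Map each canvas file to the Read call that should precede it.

    `targets` maps a file path to the intended operation, "edit" or
    "write". The returned dictionaries are a hypothetical representation
    of Read tool calls, not an actual tool API.
    """
    plan = []
    for path, operation in targets.items():
        if operation == "edit":
            # Edit-only path: a one-line Read is enough to register the file.
            plan.append({"tool": "Read", "file_path": path, "limit": 1})
        elif operation == "write":
            # Write replaces the whole file, so read all of it first.
            plan.append({"tool": "Read", "file_path": path})
        else:
            raise ValueError(f"unknown operation for {path}: {operation!r}")
    return plan


for step in preflight_plan({
    ".claude/canvas/cycle-history.yml": "edit",
    ".claude/canvas/bvssh-health.yml": "write",
}):
    print(step)
```

The ID scan is a separate step and still applies to any prefix that will be assigned.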

See CLAUDE.md Canvas writes — Read before Write for the canonical rule.

Workflow

Run these steps IN ORDER. Do not skip any step. Step 1 (cycle recording) MUST be completed FIRST — before any reflective analysis.

Step 1. Record Cycle in `.claude/canvas/cycle-history.yml` AND Decision Log (MANDATORY — DO THIS FIRST)

This step is critical. Without it, the learning metabolism has no data. You MUST do BOTH parts (1a and 1b).

Step 1a. Write cycle record to `.claude/canvas/cycle-history.yml`

Find the leaf_id and opportunity_id for the delivered solution (from .claude/canvas/opportunities.yml or .claude/canvas/gist.yml). Then write a cycle record:

- cycle_id: cycle-NNN
  leaf_id: "opp-XXX-sol-X"         # From opportunities.yml
  opportunity_id: "opp-XXX"         # Parent opportunity
  diamond_id: "d-XXX"               # From .claude/diamonds/active.yml
  completed_at: "YYYY-MM-DDTHH:MM:SSZ"
  outcome: shipped | partial | failed | discarded
  cycle_class: product-leaf | meta-dogfood | observation  # REQUIRED — see engine/cycle-learning.md#cycle-class
  predicted:
    ice_score: {i: X, c: X, e: X, total: XXX}  # REQUIRED non-zero when cycle_class=product-leaf; permitted zero for meta-dogfood/observation (state why in notes)
    feasibility_risk: low | medium | high        # From four_risks
    effort_estimate: "X days/weeks"              # Original estimate
  actual:
    effort: "X days/weeks"                       # How long it actually took
    dora:                                        # From /mycelium:dora-check or known metrics
      deploy_frequency: "..."
      lead_time: "..."
      change_failure_rate: "..."
      mttr: "..."
  calibration:
    ice_accuracy: "predicted XXX vs actual [outcome description]"
    effort_accuracy: "predicted X days vs actual X days (delta: +/-X)"
    risk_accuracy: "feasibility was [predicted] — actual was [description]"
  learnings: "Key learning from this cycle"

Update calibration_summary.total_cycles count. If total_cycles reaches a multiple of 5, prompt: "5 cycles since last review. Run /mycelium:framework-health to check calibration?"

Hard gate on cycle_class: product-leaf: if the cycle being closed shipped an OST solution leaf, predicted.ice_score.total must be non-zero. If it is zero, do NOT write the record yet — stop and ask: "This cycle shipped a product leaf but has no recorded ICE prediction. Was /mycelium:ice-score run before the cycle opened? If yes, copy the score from opportunities.yml. If no, this is a Check 38 violation — class the cycle as meta-dogfood if no design tradeoff was actually scored, or backfill the ICE score with an honest reconstruction noted as reconstructed_post_hoc: true." Reconstructed scores are excluded from calibration aggregates but preserved for the audit trail.
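The review prompt and the hard gate above reduce to two small checks. The sketch below assumes the cycle record has already been loaded into a plain dictionary with the field names from the template; the function names are hypothetical and nothing here is taken from `validate_canvas.py`.

```python
def review_due(total_cycles: int) -> bool:
    """True when total_cycles has just reached a multiple of 5."""
    return total_cycles > 0 and total_cycles % 5 == 0


def ice_gate_violation(record: dict) -> bool:
    """True when a product-leaf cycle lacks a non-zero predicted ICE total.

    Other cycle classes (meta-dogfood, observation) are permitted a zero
    score, so they never trip the gate.
    """
    if record.get("cycle_class") != "product-leaf":
        return False
    total = record.get("predicted", {}).get("ice_score", {}).get("total", 0)
    return not total


record = {
    "cycle_id": "cycle-010",
    "cycle_class": "product-leaf",
    "predicted": {"ice_score": {"i: ": 0, "total": 0}},
}

if ice_gate_violation(record):
    print("Stop: product leaf shipped without a recorded ICE prediction.")
if review_due(10):
    print("5 cycles since last review. Run /mycelium:framework-health?")
```

When the gate trips, the record is not written until the cycle is reclassified or the score is backfilled and marked `reconstructed_post_hoc: true`.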

Step 1b. Log cycle calibration summary in `.claude/harness/decision-log.md`

Write a decision log entry titled "Cycle calibration record" that includes ALL of the following (use these exact words):

cycle number and diamond ID
predicted ICE score and effort estimate (from the original canvas)
actual outcome and effort (from what really happened)
calibration assessment: was the prediction accurate?
effort delta: if the estimate was an underestimate or overestimate, state the accuracy gap (e.g., "effort accuracy: predicted 5 days vs actual 7 days, 40% underestimate")
Risk dimension accuracy (e.g., "feasibility was predicted medium — actual confirmed, analytics pipeline was indeed the hardest part")

This decision log entry ensures the calibration data is auditable alongside other decisions, not just buried in cycle-history.yml.
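The effort delta in the example is the gap relative to the original estimate: (7 - 5) / 5 = 40%. A minimal sketch of that arithmetic follows; the helper name `effort_accuracy` is hypothetical, and it assumes both values are expressed in days.

```python
def effort_accuracy(predicted_days: float, actual_days: float) -> str:
    """Format the effort delta as a percentage of the original estimate."""
    if predicted_days <= 0:
        raise ValueError("predicted_days must be positive")
    delta = actual_days - predicted_days
    percent = round(abs(delta) / predicted_days * 100)
    if delta > 0:
        verdict = f"{percent}% underestimate"  # work took longer than predicted
    elif delta < 0:
        verdict = f"{percent}% overestimate"   # work finished sooner than predicted
    else:
        verdict = "on target"
    return (
        f"effort accuracy: predicted {predicted_days:g} days "
        f"vs actual {actual_days:g} days, {verdict}"
    )


print(effort_accuracy(5, 7))
# effort accuracy: predicted 5 days vs actual 7 days, 40% underestimate
```

Dividing by the prediction rather than the actual keeps the percentage comparable across cycles, since the prediction is the quantity being calibrated.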

Step 2. What Went Well?

Which patterns from patterns.md were reused successfully?
What new approaches worked?
Where did the theory gates catch a real problem?

Step 3. What Didn't Go Well?

What mistakes were made? (Add to corrections.md)
Where did we skip a guardrail and regret it?
What took longer than expected and why?

Step 4. What Should Change?

New corrections to add
New patterns to capture
Process improvements
Guardrail adjustments
ADR review (if docs/adr/ exists): did implementation follow the decided approach? Any consequences that turned out differently than expected? Mark superseded ADRs.

Step 5. BVSSH Dimension Check

Better: Did quality improve or degrade?
Value: Did we deliver actual user value?
Sooner: Was our flow efficient?
Safer: Did we maintain security and trust?
Happier: How is team satisfaction? Customer advocacy? Societal impact? Was compute usage proportionate to value (not wasteful)?

Step 6. Rework Follow-Up (14-day window)

If this retrospective is for a cycle completed more than 14 days ago, check:

How many corrections were logged against this delivery since completion? → rework.post_delivery_corrections
How many regressions occurred? → rework.post_delivery_regressions
Days to first regression? → rework.days_to_first_regression

Update the cycle record in .claude/canvas/cycle-history.yml with the rework fields. This is the denominator — the hidden cost of delivery that velocity metrics miss.

If this retrospective is for a just-completed cycle, prompt: "Set a reminder to check rework in 14 days. Run /mycelium:retrospective rework-check [cycle-id] after that."
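Whether the rework check is due can be derived from the `completed_at` timestamp in the cycle record. The sketch below assumes the `YYYY-MM-DDTHH:MM:SSZ` format from the cycle template and treats all times as UTC; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

REWORK_WINDOW = timedelta(days=14)
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(timestamp: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' timestamp as timezone-aware UTC."""
    return datetime.strptime(timestamp, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def rework_check_due(completed_at: str, now: datetime) -> bool:
    """True when the cycle was completed more than 14 days before `now`."""
    return now - parse_utc(completed_at) > REWORK_WINDOW


def days_to_first_regression(completed_at: str, first_regression_at: str) -> int:
    """Whole days between completion and the first logged regression."""
    return (parse_utc(first_regression_at) - parse_utc(completed_at)).days


now = datetime(2026, 6, 2, tzinfo=timezone.utc)
print(rework_check_due("2026-05-01T00:00:00Z", now))   # True
print(rework_check_due("2026-05-25T00:00:00Z", now))   # False
print(days_to_first_regression("2026-05-01T00:00:00Z", "2026-05-04T12:00:00Z"))  # 3
```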

Source: Paddo (the denominator problem — 43% of AI-assisted code requires post-delivery debugging). Forsgren (change failure rate as a trailing indicator).

Root Cause Analysis (when "What Didn't Go Well" surfaces a significant problem)

Use these two complementary techniques. Fishbone gives breadth (all possible causes). 5 Whys gives depth (one cause traced to its root).

Fishbone Diagram (Ishikawa)

Map all potential causes before investigating any. Structure:

                        ┌─ People (skills, handoffs, communication)
                        ├─ Process (gates, cadence, workflow)
Problem ◄───────────────├─ Product (canvas, evidence, assumptions)
(effect)                ├─ Platform (tools, infra, dependencies)
                        ├─ Principles (which theory/guardrail failed?)
                        └─ Pressures (deadlines, scope, external)

Write the specific problem at the head
Brainstorm causes in each category — add as branches
Drill into sub-causes until you reach actionable items
Vote/rank the most likely root causes for investigation

Ishikawa's original 6M manufacturing categories: Man (Manpower), Machine, Method, Material, Measurement, Mother Nature (Environment). Adapted for product development as: Man→People, Machine→Platform, Method→Process, Material→Product (inputs to the work), Measurement→Principles (what we measure against), Mother Nature→Pressures (external forces).

5 Whys (Toyoda)

For the top-ranked cause from the fishbone, ask "why?" five times:

Why did this happen? → [first-level cause]
Why did that happen? → [second-level cause]
Why? → [deeper]
Why? → [deeper]
Why? → [root cause — usually systemic]

Stop rule: Stop when ANY of these conditions is met:

You reach something you can change systemically (a guardrail, gate, or process step)
Asking "why" again would require speculation rather than verifiable fact
You reach a cause outside your sphere of influence (an escalation point, not a dead end)
The answer would be the same regardless of asking "why" (you've hit bedrock)

Anti-pattern: Stopping at "human error" — that's never the root cause. Ask why the system allowed the error.

Source: Ishikawa (cause-and-effect diagrams), Toyoda/Ohno (5 Whys), adapted for agentic product development.

Waste Identification (Ohno — 7 Wastes / TIMWOOD)

"Eliminating waste is the foundation of lean." (Ohno)

Before root cause analysis, identify which waste category the problem falls into:

Waste	Product Development Form	Detection
Transportation	Handoffs between people/teams, between discovery and delivery	Count handoffs in the value stream
Inventory	WIP, unshipped code, unfinished features, unmerged branches, open PRs	Check WIP limits, branch age
Motion	Context switching between tasks, tools, codebases	Track focus time vs fragmented time
Waiting	Blocked tasks, review queues, approval bottlenecks, blocked dependencies	Measure wait-to-work ratio
Overproduction	Building features nobody uses, YAGNI violations	Compare shipped features to validated needs
Overprocessing	Gold-plating, unnecessary abstraction, premature optimization	"Would removing this step reduce value?"
Defects	Bugs, rework, corrections, failed deployments	Track defect escape rate

Also watch for: Muri (overburden → BVSSH Happier / sustainable pace) and Mura (unevenness → delivery cadence variation).

Source: Taiichi Ohno, Sakichi Toyoda (Toyota Production System). Mapped to product development via Poppendieck (Lean Software Development).

Blameless Post-Mortem Format (SRE)

For incidents or significant failures, use the SRE blameless post-mortem:

Timeline: What happened, when, in what order
Impact: Who was affected, how severely, for how long
Contributing factors: What conditions led to this (NOT "who caused it")
Root cause: The systemic issue, not the human action (use fishbone + 5 Whys above)
Action items: Specific, assigned, time-bound improvements
What went well: What prevented it from being worse

Rule: No blame. Focus on the system, not the person. Source: Beyer et al. (SRE)

Refactoring Prompt

After delivery retrospective, always ask:

"Are there refactoring opportunities? Duplicated logic (DRY)? Unnecessary complexity (KISS)?" Source: Beck (XP), Fowler (Refactoring)

Output

Update .claude/memory/corrections.md with new corrections
Update .claude/memory/patterns.md with new patterns
Update .claude/memory/delivery-journal.md with retrospective entry
Update .claude/canvas/bvssh-health.yml if dimensions changed
Log in .claude/harness/decision-log.md
Record cycle in .claude/canvas/cycle-history.yml (see Step 1 above)

Counter-Argument Check (Bias Mitigation)

Before finalizing the retrospective, draft a one-line counter-argument for each major claim: "What's the strongest case that this 'went well' was actually luck? That this 'went wrong' was actually unavoidable? That this 'pattern' is actually noise?" If you can't articulate counter-cases, run /mycelium:devils-advocate before locking in the corrections/patterns.

This addresses the bias cluster documented in corrections.md (L5 sycophancy 2026-04-20, eval overfitting 2026-04-30, sharper-framing-isn't-righter 2026-05-03). Retrospectives are particularly bias-prone — narrative coherence is rewarded, the agent is incentivized to find tidy patterns, and post-hoc rationalization is the natural mode. Counter-arguments break that gravity.

Hindsight Bias Check

Retrospectives are the natural home of hindsight bias — the "I knew it all along" effect that rewrites uncertainty as foreknowledge. For every claim of the form "we should have seen X coming," ask: would I have predicted X with the evidence available BEFORE the outcome? If the honest answer is "no, that evidence only became diagnostic in retrospect," log it as a learning about evidence interpretation, not as a missed signal. This protects future retrospectives from manufacturing false should-have-knowns that distort confidence calibration.

Source: Fischhoff, "Hindsight ≠ Foresight: The Effect of Outcome Knowledge on Judgment Under Uncertainty" (1975).

This check is especially important when proposing graduation candidates (recurring corrections → guardrails): make sure the recurrence is real, not three instances of pattern-matching by the agent itself.

name	retrospective
description	Structured retrospective after completing a delivery increment or diamond. Captures learning for continuous improvement.
metadata	{"instruction_budget":"58","framework_dependency":"mycelium","framework_dependency_note":"This skill is designed to run within the Mycelium framework (https://github.com/haabe/mycelium). Standalone use will skip the canvas state, theory gates, and harness behavior the skill assumes. Install: /plugin install mycelium@haabe/mycelium."}
