retro
| name | retro |
| description | Conversation retrospective and skill improvement. Use when user says "retro", "retrospective", "lessons learned", "improve skills", "what went wrong", "auto-improve", or at the end of a non-trivial work session. |
| requires | ["rules","workspace"] |
| compatibility | macOS/Linux, any project with teatree skills. |
| triggers | {"priority":100,"end_of_session":true,"keywords":["\\b(retro|retrospective|lessons learned|improve skills?|auto.?improve|what went wrong)\\b"]} |
| search_hints | ["retro","retrospective","lessons learned"] |
| metadata | {"version":"0.0.1","subagent_safe":false} |
requires:.t3 CLI commands. Load /t3:workspace now if not already loaded.

Optional: If T3_REVIEW_SKILL is configured (e.g., ac-reviewing-codebase), retro recommends running it after skill modifications for deeper architectural quality assurance. Retro is lightweight and tactical; the review skill is methodical and systematic.
Retro's behavior depends on these ~/.teatree variables and on whether the current repo contains an overlay package:

- T3_CONTRIBUTE — false (default) or true:
  - false: only improve the active project overlay. Core skill gaps are noted in conversation but not acted on.
  - true: also improve core skills in $T3_REPO. Retro creates a local commit; whether it then pushes is governed by the mode resolution below.
- t3:rules § Publishing Actions Are Mode-Conditional. Resolve the effective mode in the prescribed order (T3_MODE env → active overlay config → global [teatree] table → per-repo memory overrides → default interactive).
  - auto: push immediately after the privacy scan passes — no prompt, no "your call", no deferral to /t3:contribute. Open the PR if one doesn't already exist.
  - interactive: commit locally and remind the user to run /t3:contribute.
- T3_PUSH / T3_AUTO_PUSH_FORK env vars are honored only when no mode is configured anywhere; the mode setting subsumes them and wins on conflict. Do not gate retro pushes on T3_PUSH when mode = "auto" is set in ~/.teatree.toml. A fork-vs-upstream split (origin ≠ T3_UPSTREAM) does not change the push decision; it only affects whether /t3:contribute opens an upstream issue afterward.
- T3_UPSTREAM — upstream GitHub repo (e.g., souliane/teatree). Used by /t3:contribute to open issues upstream after pushing. When origin matches T3_UPSTREAM, pushes already land directly on upstream.
- T3_PRIVACY — privacy check strictness: strict (default) or relaxed. See § Privacy Scan.
- T3_REVIEW_SKILL — name of an external skill review tool (e.g., ac-reviewing-codebase). If set, retro recommends running it after skill improvements. If not set, retro suggests installing one during first run and storing the preference.

Retro is agent-platform neutral. The workflow, ~/.teatree variables, and teatree slash commands stay the same across platforms.
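The mode resolution order can be sketched as a first-match lookup. This is an illustrative sketch only, not teatree's implementation: the function names, the `mode` key used for the overlay config and per-repo overrides, and the choice to skip unrecognized values are all assumptions. The T3_PUSH / T3_AUTO_PUSH_FORK fallback is deliberately not modeled.

```python
from typing import Mapping, Optional

VALID_MODES = ("auto", "interactive")


def resolve_mode(
    env: Mapping[str, str],
    overlay_config: Optional[Mapping[str, str]] = None,
    global_config: Optional[Mapping[str, str]] = None,
    repo_overrides: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first configured mode, walking sources in the documented order."""
    candidates = (
        env.get("T3_MODE"),                  # 1. T3_MODE env
        (overlay_config or {}).get("mode"),  # 2. active overlay config (key name assumed)
        (global_config or {}).get("mode"),   # 3. global [teatree] table
        (repo_overrides or {}).get("mode"),  # 4. per-repo memory overrides (key name assumed)
    )
    for value in candidates:
        if value in VALID_MODES:  # assumption: unknown values are skipped
            return value
    return "interactive"                     # 5. default


def should_push(mode: str, privacy_scan_passed: bool) -> bool:
    """auto pushes once the privacy scan passes; interactive only commits locally."""
    return mode == "auto" and privacy_scan_passed
```

With this shape, an explicit T3_MODE always wins over the global table, and an entirely unconfigured setup falls back to interactive.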
Systematic review of the current conversation to extract failures, near-misses, and lessons learned, then improve the skill system so they never recur.
When to run (proactively — do NOT wait for the user to ask):
/t3:retro

If retro is in progress when compaction is imminent (long conversation, many tool calls), write findings to a temporary file immediately before they are lost. Use the t3-snapshot- prefix so the PostCompact hook can find and inject the file back into context automatically:
cat > /tmp/t3-snapshot-${CLAUDE_SESSION_ID:-manual}-$(date +%Y%m%d-%H%M).md <<'EOF'
# Retro Findings (pre-compaction snapshot)
<paste categorized findings here>
EOF
Recovery is automatic. The teatree PostCompact hook scans for t3-snapshot-*.md files and injects their content as additionalContext after compaction. You do not need to remember to read the file — it will appear in your context. Delete the temp file once findings are persisted to durable skill files.
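For scripted use, the same snapshot convention can be sketched in Python. The helper names `write_snapshot` and `find_snapshots` are hypothetical, and the discovery glob only approximates what the PostCompact hook is described as doing; the real hook may differ.

```python
import glob
import os
from datetime import datetime
from typing import List


def write_snapshot(findings: str, directory: str = "/tmp") -> str:
    """Write findings to a t3-snapshot-*.md file and return its path."""
    # Mirrors ${CLAUDE_SESSION_ID:-manual}: fall back when unset or empty.
    session = os.environ.get("CLAUDE_SESSION_ID") or "manual"
    stamp = datetime.now().strftime("%Y%m%d-%H%M")
    path = os.path.join(directory, f"t3-snapshot-{session}-{stamp}.md")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("# Retro Findings (pre-compaction snapshot)\n")
        fh.write(findings.rstrip() + "\n")
    return path


def find_snapshots(directory: str = "/tmp") -> List[str]:
    """List snapshot files by the shared prefix (approximation of hook discovery)."""
    return sorted(glob.glob(os.path.join(directory, "t3-snapshot-*.md")))
```

Once the findings are persisted to durable skill files, remove the snapshot with `os.remove(path)`.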
Retro works from any conversation — not just teatree-managed projects. It identifies which skills were used in the session and determines where improvements should go.
Scan the conversation for loaded skills (skill tool invocations, system reminders mentioning skills, explicit /skill calls). Build a list of every skill that influenced the session.
For each skill, resolve its real path (follow symlinks) and check whether it lives in a git repository:
real_path=$(readlink -f "<skill_dir>")
git -C "$real_path" rev-parse --git-dir >/dev/null 2>&1 && echo "editable" || echo "read-only"
| Editability | Where to write improvements |
|---|---|
| Editable (symlink → local git repo) | Improve the skill files directly (following the write rules in § Fix Skills) |
| Read-only (no git repo, installed copy, or remote-only) | Write to the best available fallback: repo-level agent instructions, user-level agent config, or user memory files. Choose whichever is closest to the point of use. |
When writing to fallback locations, clearly mark the entry as originating from a retro finding: include the skill name and a brief rationale so the entry can be promoted to the skill later if it becomes editable.
If you can't determine whether a skill is editable, or if you're unsure whether an improvement belongs in the skill vs. the agent config vs. memory — ask the user. Retro is meta-work; human-in-the-loop is expected.
Retro is not complete until every confirmed finding is written to a durable home in the same retro pass. Conversation output is not durable storage.
Retro output must include a persistence summary:
When retro runs for a teatree-managed ticket, mark the retro phase on the active session so the t3 <overlay> pr create shipping gate can enforce retro-before-push:
t3 <overlay> lifecycle visit-phase <ticket_id> retro
Skip this step when retro runs outside a ticket context (no session exists). The shipping gate fails open when no session is found, so skipping is safe — the marker only matters when a session is already tracking phases.
Retro should optimize for speed with repeatability. Use AI for judgment and synthesis; use scripts for deterministic evidence gathering and bulk transformations.
flowchart TD
A["Session ends or user triggers retro"] --> B["1. Conversation Audit"]
B --> C["Categorize every issue:<br/>false completion, skill gap,<br/>playbook miss, over/under-engineering,<br/>hook gap, stale guidance"]
C --> D["2. Root Cause Analysis"]
D --> E["Why did each issue happen?<br/>Missing guardrail? Vague verification?<br/>Skill not loaded? Outdated step?"]
E --> F["3. Fix Skills"]
F --> EA{"Skill editable?<br/>(local git repo)"}
EA -->|"Yes"| G{"Where does the fix go?"}
EA -->|"No (read-only)"| J["Write to agent config<br/>or memory files"]
G -->|"Project-specific"| H["Write to active overlay app<br/>(troubleshooting, playbooks, guardrails)"]
G -->|"Core skill gap<br/>(T3_CONTRIBUTE=true)"| I["Write to $T3_REPO<br/>(skill files, references, hooks)"]
G -->|"User preference"| J
H & I & J --> SIMP["3b. Simplification Pass<br/>(remove duplicate / stale /<br/>unused rules and checks)"]
SIMP --> K["4. Quality Checks"]
K --> L["No duplication across skills?"]
K --> M["Single source of truth?"]
K --> N["Pre-commit hooks pass?"]
K --> O["Tests pass?"]
L & M & N & O --> P{"T3_CONTRIBUTE=true?"}
P -->|"Yes"| Q["5. Commit on current branch<br/>(worktree, never main clone)"]
P -->|"No"| R["Done — overlay improved"]
Q --> S["6. Privacy Scan<br/>(emails, paths, keys, banned terms)"]
S --> MODE{"Effective mode<br/>(see § Configuration)"}
MODE -->|"auto"| PUSH["Push + open PR<br/>(per t3:rules)"]
MODE -->|"interactive"| CONTRIB["User runs /t3:contribute later<br/>to review, push, open upstream issue"]
Scope-match check first (Non-Negotiable). Before auditing individual failures, re-open the ticket/issue body that framed this session and map every acceptance criterion, phase, or deliverable to what actually shipped. If ANY AC is unshipped and the session was declared complete (MR merged with Closes/Fixes, /t3:next run, ticket marked done), that is a False completion finding and it outranks every tactical finding below. Re-reading the issue body is not optional — scoping→implementation drift is invisible from the conversation alone.
Check dashboard server logs next. Inspect the teatree dashboard log for errors that may not have surfaced in the conversation:
LOG="$HOME/.local/share/teatree/$(basename "$PWD")/logs/dashboard.log"
[ -f "$LOG" ] && grep -iE "error|traceback|exception|critical" "$LOG" | tail -30
Errors in the log (500s, tracebacks, failed task launches) are retro findings even if the user didn't mention them. Categorize them alongside conversation issues below.
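When the log needs to be scanned from a script rather than a shell, the same filter can be sketched as follows. The function name `scan_log` is hypothetical; the pattern and the 30-line limit mirror the command above.

```python
import re
from collections import deque
from typing import Iterable, List

# Same terms as the grep above, matched case-insensitively.
ERROR_PATTERN = re.compile(r"error|traceback|exception|critical", re.IGNORECASE)


def scan_log(lines: Iterable[str], limit: int = 30) -> List[str]:
    """Return the last `limit` lines that mention an error-like term."""
    hits = deque(maxlen=limit)  # keeps only the most recent matches
    for line in lines:
        if ERROR_PATTERN.search(line):
            hits.append(line.rstrip("\n"))
    return list(hits)
```

Passing an open file object (`with open(log_path) as fh: scan_log(fh)`) streams the log instead of loading it whole.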
Review the full conversation and categorize every issue:
| Category | Description | Example |
|---|---|---|
| False completion | Claimed "done" without verifying all requirements | Declared feature complete without running the full test suite |
| Skill not loaded | A relevant skill existed but wasn't loaded | Didn't load the active project overlay skill when working in project context |
| Playbook not consulted | A playbook covered the task but wasn't read | Didn't check the relevant playbook for the translation checklist |
| Over-engineering | Did unnecessary work because of wrong assumptions | Planned enum/migration/serializer changes when admin config sufficed |
| Under-engineering | Missed required work | Only updated the backend without the corresponding frontend changes |
| Hook gap | Auto-loading didn't trigger when it should have | Hook didn't suggest project overlay in matching context |
| Stale guidance | Followed outdated instructions | Playbook described pre-refactoring patterns |
| Paradigm mismatch | The architecture itself is the bottleneck, not a missing skill or guardrail | Repeatedly refining skill prose for a workflow that should be deterministic code; 3+ retro findings pointing to the same structural limitation; system untestable without an LLM |
| Overhead without value | A rule, check, or procedure added friction this session without preventing a real failure | Verification step that never flagged anything; duplicated guardrail across skills; step-by-step commands the CLI already handles. Fed into § 3b Simplification Pass. |
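One way to keep audit results structured is a small record type per finding. This is a sketch under assumptions: the `Finding` class and the helpers are hypothetical, not part of teatree. It encodes two rules stated in this document: False completion outranks tactical findings, and a finding is not done until it has a durable home.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Category(Enum):
    FALSE_COMPLETION = "False completion"
    SKILL_NOT_LOADED = "Skill not loaded"
    PLAYBOOK_NOT_CONSULTED = "Playbook not consulted"
    OVER_ENGINEERING = "Over-engineering"
    UNDER_ENGINEERING = "Under-engineering"
    HOOK_GAP = "Hook gap"
    STALE_GUIDANCE = "Stale guidance"
    PARADIGM_MISMATCH = "Paradigm mismatch"
    OVERHEAD_WITHOUT_VALUE = "Overhead without value"


@dataclass
class Finding:
    category: Category
    summary: str
    persisted_to: str = ""  # path of the durable home; empty until written


def ordered(findings: List[Finding]) -> List[Finding]:
    """False completion first; other findings keep their original order."""
    return sorted(findings, key=lambda f: f.category is not Category.FALSE_COMPLETION)


def unpersisted(findings: List[Finding]) -> List[Finding]:
    """Findings that still lack a durable home, so retro is not complete."""
    return [f for f in findings if not f.persisted_to]
```

An empty `unpersisted(...)` result is a cheap completeness check before producing the persistence summary.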
For each issue, determine why it happened:
Pre-write editability check: Before writing to ANY skill, verify it is editable (see § Scope & Editability). For teatree-specific paths:
# Check core (when T3_CONTRIBUTE=true)
git -C "$T3_REPO" rev-parse --git-dir >/dev/null 2>&1 || echo "STOP: T3_REPO is not a git repo"
If a skill is not editable (no local git repo), write improvements to the best fallback location — repo-level agent instructions, user config, or memory files. See § Scope & Editability for the full decision table. In standalone mode with no overlay project, skip the overlay check.
Load coding skills before implementing: Retro fixes often involve writing code (Python, Django, shell). Load the appropriate coding skill (/ac-django, /ac-python, etc.) before implementing — not just for model/view work but for any code: settings, logging, CLI commands, hook scripts. Retro is not exempt from coding standards.
Determine the target based on T3_CONTRIBUTE and the nature of the fix:
These go to the overlay regardless of contribution level:
| What to fix | Where to write | Format |
|---|---|---|
| Non-obvious fix or recurring failure | <overlay app>/references/troubleshooting.md or repo AGENTS.md if no overlay refs exist | symptom -> root cause -> fix -> prevention |
| New repeatable multi-step pattern | <overlay app>/references/playbooks/<topic>.md + update README.md | step-by-step guide |
| Outdated playbook step | Update the overlay playbook directly | delete/replace stale instructions |
| "Do this, not that" guardrail | <overlay app>/references/playbooks/archive-derived-guardrails.md | do this / not that pair |
T3_CONTRIBUTE=false (default)

Do NOT modify files under $T3_REPO. If you detect a gap in a core skill, note it in conversation output so the user is aware, but take no action on core files.

T3_CONTRIBUTE=true

Retro can also modify core teatree skills in the user's fork:
| What to fix | Where to write |
|---|---|
| Infrastructure/worktree failure | $T3_REPO/skills/workspace/references/troubleshooting.md |
| Hook should have triggered | $T3_REPO/hooks/scripts/hook_router.py or the relevant hook script |
| Missing verification step | The core skill that owns that workflow phase |
| Stale or incorrect guidance in a core skill | The affected skill's SKILL.md or reference file |
After modifying core skills: follow § Commit to Fork.
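The routing decision can be summarized as a small function. This is a sketch of the decision flow described in this document, with hypothetical names (`fix_target` and the three `kind` labels); it is not a teatree API.

```python
def fix_target(kind: str, editable: bool, contribute: bool) -> str:
    """Pick where a retro fix should be written.

    kind is one of "project_specific", "core_skill_gap", "user_preference"
    (labels invented for this sketch).
    """
    if kind not in ("project_specific", "core_skill_gap", "user_preference"):
        raise ValueError(f"unknown kind: {kind}")
    # Read-only skills and user preferences both fall back to config/memory.
    if not editable or kind == "user_preference":
        return "agent config or memory files"
    if kind == "project_specific":
        return "active overlay"
    # Core skill gap: only act on $T3_REPO when contribution is enabled.
    return "$T3_REPO" if contribute else "note in conversation only"
```

The editability check comes first because a read-only skill cannot receive the fix regardless of the T3_CONTRIBUTE setting.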
Retro should remove overhead with the same confidence it adds guardrails. Most skill drift comes from accumulation — rules layered on over time, each defensible in isolation, collectively expensive. Every retro must ask: did any rule or check create friction this session without preventing a real failure? If yes, simplify in the same commit as the other findings.
- CLAUDE.md. Keep one canonical home; replace others with a one-line cross-reference.
- t3 already does the work (see § "Never write CLI procedures into skills" above).
- --no-verify bans, deletion approvals. Cost is ~0 tokens per turn; blast radius is real.
- CLAUDE.md entry). Ask before removing.
- rules/SKILL.md or the most relevant dedicated skill); replace duplicates with one-line pointers (See <skill>/SKILL.md § <anchor>). Keep anchors stable so cross-references don't break.
- t3 <command>" (stale) or "never triggered across N retros" (unused).

Use refactor(<skill>): simplify <what> (not fix(<skill>)). One commit per coherent simplification so reverts stay surgical. Example: refactor(ship): drop duplicate push-confirmation rule — canonical in rules/SKILL.md.
If a rule looks like overhead but you cannot confirm it is unused, ask with AskUserQuestion. Show the rule, show grep evidence of recent invocations, and propose remove vs. keep. The cost of asking is low; the cost of removing a load-bearing rule is high.
- (Source: AGENTS.md § <section>) or equivalent. A duplicate without a reference is a duplication bug that will drift silently.
- version: in YAML frontmatter — that's auto-managed.
- draft: false in frontmatter are published — never modify them. Draft content (draft: true or no frontmatter) may be improved.
- t3 already executes). Before writing a finding that includes a command or procedure, check: does t3 already handle this? If yes, the skill should say "use t3 <command>" — not reproduce the steps the CLI performs internally. Procedural documentation belongs in BLUEPRINT.md, AGENTS.md, CLAUDE.md, README.md, or docs/ — not in skills. Violating this tempts agents to follow the documented manual steps instead of calling the CLI.
- SKILL.md, references/) over writing to user-specific config (the agent's personal config and memory files). Skills benefit ALL users; personal config only helps one machine. Memory/config files are only for: user preferences (formatting, tone), environment-specific facts (paths, usernames, credentials), and user-specific workflow choices. Guardrails, troubleshooting, patterns, and "do this not that" rules belong in skills. Checklist before writing to memory/config: "Would another user of these skills need this too?" — if yes, put it in a skill. When in doubt, prefer skill files over personal config — skills are portable, personal config is not.
- <skill> § <section>" to prevent drift. Only fully remove entries that are truly redundant (pure cross-references with no actionable content).
- Past failure (2026-MM-DD): …, Known failure (#NNN): …), "I did X / the user did Y" anecdotes, and PR-specific case studies are personal-memory material; they identify a specific session and accumulate as session-narrative noise in a public skills repo. The rule's value is intrinsic — it should read coherently to a reader who never saw the incident. If a one-line anti-pattern bullet captures the failure mode (e.g., "returning an error string from a management command instead of raising"), keep that; otherwise, drop the citation entirely and trust the rule.

WHEN to create a new playbook:
WHEN to update an existing playbook:
WHERE to create playbooks:
- <project-skill>/references/playbooks/<scope>-<topic>.md
- <project>- (backend), frontend- (frontend), cross-repo- (multi-repo), none (process)
- README.md index with the new entry

Playbook staleness check: Before following any playbook, verify instructions against current code. If the codebase has moved to a config-driven approach or the referenced pattern no longer exists, the playbook is stale — fix it immediately.
After completing all retro changes, check for unpushed work across ALL repos touched during the session. The goal is to ensure no work is forgotten — orphaned branches, stashes, and uncommitted changes are all risks.
For each touched repo, collect and display:
- git log --oneline @{u}..HEAD
- git config init.defaultBranch (fallback: main), then list all other local branches with git branch --no-merged <main> — these may contain in-progress work
- git stash list — stashes are easy to forget and may contain important WIP
- git status --short — show the summary, not just "dirty"
- Co-Authored-By trailers (should be removed per user's global config)

Before treating any local branch as "unpushed work", cross-reference against the default branch. Squash-merges create new SHAs, so git log --not --remotes by SHA alone will flag merged branches as unsynced.
Delegate this to the CLI: run t3 teatree workspace clean-all. It classifies each branch's unsynced commits into squash_merged (subject matches a commit on origin/main after stripping (#NNN) suffix and conventional-commit type prefix), merge_commits (multi-parent — safe to discard), and genuinely_ahead (real pending work). Only genuinely-ahead branches block cleanup.
Inside a TTY, clean-all prompts for each blocked worktree — [P]ush to remote / [A]bandon (force delete) / [S]kip. In a non-TTY context it preserves the old skip-and-report behaviour. Reach for the subject-matching Python recipe only when you need to classify raw stashes or stray local branches outside a tracked worktree.
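For the fallback case of stray branches or stashes, subject matching can be sketched as below. This is an assumption-laden illustration, not the code `clean-all` runs: the regexes, the case-insensitive comparison, and the function names are invented here, and only the three bucket names come from the description above.

```python
import re
from typing import Iterable

_PR_SUFFIX = re.compile(r"\s*\(#\d+\)\s*$")                # trailing "(#123)"
_TYPE_PREFIX = re.compile(r"^[a-z]+(?:\([^)]*\))?!?:\s*")  # "feat(scope): ", "fix!: "


def normalize_subject(subject: str) -> str:
    """Strip the PR-number suffix and conventional-commit type prefix."""
    s = _PR_SUFFIX.sub("", subject.strip())
    s = _TYPE_PREFIX.sub("", s)
    return s.strip().lower()  # assumption: compare case-insensitively


def classify_commit(subject: str, parent_count: int, main_subjects: Iterable[str]) -> str:
    """Bucket one unsynced commit using the three categories named above."""
    if parent_count > 1:
        return "merge_commits"  # multi-parent, safe to discard
    known = {normalize_subject(m) for m in main_subjects}
    if normalize_subject(subject) in known:
        return "squash_merged"
    return "genuinely_ahead"
```

Subjects for `main_subjects` could come from `git log --format=%s origin/main`, and parent counts from `git log --format=%P`.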
After applying all fixes:
- prek run --all-files to validate

Commit to Fork (T3_CONTRIBUTE=true)

When T3_CONTRIBUTE=true and retro modified files under $T3_REPO, commit automatically on the session's working branch inside a worktree (never the main clone, never main). The commit is local-only — /t3:contribute handles the push.
See references/commit-to-fork.md for pre-flight checks, branch selection rules, the confirmation template, and the T3_AUTO_PUSH_FORK exception.
Before committing to the fork or creating an upstream issue, scan all public-facing content the agent has authored or is about to author this session — not just the diff of newly-staged files.
Full branch-vs-base diff, not just the current session's hunks.
git -C "$T3_REPO" diff @{upstream}..HEAD | t3 tool privacy-scan -
The branch may carry older commits from prior sessions or compacted work that the agent never re-read. git diff @{upstream}..HEAD covers every commit between the pushed base and HEAD. git diff --cached or git diff HEAD~..HEAD is not enough — it only shows the most recent work.
Commit subjects and bodies on the branch.
git -C "$T3_REPO" log --format='%H %s%n%b' @{upstream}..HEAD | t3 tool privacy-scan -
Commit messages are public and indexed. A subject that names what was scrubbed leaks the fact of the scrub even when the diff itself is clean.
PR, issue, and comment bodies the agent has written this session. Before declaring retro complete, grep every published artifact — PR descriptions you authored, PR/issue comments you posted, release notes, changelogs, and the branch name itself. Internal IPs, /Users/… paths, customer names, ticket IDs, or class-of-data words can slip in here even when the code diff is clean.
Memory and config files written this session. Fresh memory writes to MEMORY.md or per-memory files can repeat a leaked string verbatim ("the leaked value was …"). Reference the incident without reproducing the string.
Run the standard t3 tool privacy-scan detectors (emails, /Users/ and /home/ paths, private IPs, API keys glpat- / sk- / ghp_, internal hostnames, and T3_BANNED_TERMS).
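To make the detector classes concrete, here is a rough approximation in Python. It is not the `t3 tool privacy-scan` implementation: every regex is a guess at what such a detector might look like, internal hostnames are not modeled, and banned terms are passed in as a plain list because the format of T3_BANNED_TERMS is not specified here.

```python
import re
from typing import Dict, Iterable, List

# Illustrative patterns only; the real detectors may be stricter or broader.
DETECTORS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "home_path": re.compile(r"/(?:Users|home)/[^\s/]+"),
    "private_ip": re.compile(
        r"\b(?:10\.\d{1,3}|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b"
    ),
    "api_key": re.compile(r"\b(?:glpat-|sk-|ghp_)[A-Za-z0-9_-]{8,}"),
}


def scan(text: str, banned_terms: Iterable[str] = ()) -> Dict[str, List[str]]:
    """Return matches grouped by detector name; an empty dict means clean."""
    findings: Dict[str, List[str]] = {}
    for name, pattern in DETECTORS.items():
        hits = pattern.findall(text)
        if hits:
            findings[name] = hits
    lowered = text.lower()
    banned = [t for t in banned_terms if t and t.lower() in lowered]
    if banned:
        findings["banned_term"] = banned
    return findings
```

The same function can be applied to the branch diff, the commit messages, and any PR or comment body, which matches the "all public-facing content" scope of the scan.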
In addition, when the session involved remediating a leak, grep the Streisand-effect word list from rules/SKILL.md § "Leak Remediation — Silent Scrubs":
leak|scrub|redact|real|private|personal|sensitive|accident|phone|email|password|token|credential|secret|address
A hit on those words in a commit subject, branch name, or public comment means the remediation itself amplifies the leak — rewrite or delete before declaring done.
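The word-list check is easy to script. A minimal sketch, assuming unanchored, case-insensitive matching like a plain `grep -i` (so "leaked" hits on "leak"); the function name `amplifying_words` is hypothetical.

```python
import re
from typing import List

# Word list copied from the rule above; substring matching is an assumption.
STREISAND_WORDS = re.compile(
    r"leak|scrub|redact|real|private|personal|sensitive|accident|"
    r"phone|email|password|token|credential|secret|address",
    re.IGNORECASE,
)


def amplifying_words(text: str) -> List[str]:
    """Return the distinct list words found in a subject, branch name, or comment."""
    return sorted({m.lower() for m in STREISAND_WORDS.findall(text)})
```

Substring matching over-reports (for example "real" inside "really"), which errs on the side of a manual look rather than a missed amplification.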
T3_PRIVACY levels

- strict (default): Exit 1 on ANY finding. Require user to manually resolve before proceeding.
- relaxed: Warn on findings but exit 0. Pass --no-strict to the script.

When T3_PRIVACY is not set, default to strict.
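The two levels reduce to a small exit-code rule. A sketch with a hypothetical function name; treating any unrecognized value as strict is an assumption made here for safety, not documented behavior.

```python
from typing import Mapping, Sequence


def privacy_exit_code(findings: Sequence[str], config: Mapping[str, str]) -> int:
    """Map scan findings to an exit code according to T3_PRIVACY."""
    level = config.get("T3_PRIVACY") or "strict"  # unset or empty means strict
    if level == "relaxed":
        return 0  # findings are warnings only
    # strict, and (by assumption) any unrecognized value
    return 1 if findings else 0
```

For example, `privacy_exit_code(["email"], {})` blocks the commit, while the same findings under `{"T3_PRIVACY": "relaxed"}` do not.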
- t3 handles it, say "use t3 <command>" — don't reproduce the steps. Procedural docs belong in BLUEPRINT.md/AGENTS.md/docs, not skills.
- interactive mode — use /t3:contribute for push + upstream issue creation. In auto mode, push directly per § Configuration.

During every retro, scan the agent's personal config and memory files.
Discovery: Memory files are platform-specific. Discover them dynamically:
- ~/.claude/projects/*/memory/MEMORY.md — each match is an index file; read it to find individual memory files in the same directory.
- CLAUDE.md, .cursorrules, AGENTS.md, or similar agent config in the project root.

Actions:

- (Also in: ...) or containing domain knowledge that belongs in a skill file. Propose promoting them — the (Also in: ...) marker indicates the entry was intentionally duplicated as a safety net, but the authoritative source should be verified and kept current.

If T3_REVIEW_SKILL is configured and skill files were modified during this retro:

- /$T3_REVIEW_SKILL).

If T3_REVIEW_SKILL is NOT configured, include this note in the retro output: "Consider installing a skill review tool for periodic deep quality audits. Set T3_REVIEW_SKILL in ~/.teatree to enable integration."