| name | introspect-session-reflect |
| description | Evaluate reasoning effectiveness after completing a session to learn what worked, what didn't, and what to try differently in future work. Use when: (1) asked to learn from how a completed reasoning approach went, (2) after finishing significant problem-solving, design, or decision-making tasks, (3) when a completed reasoning approach worked particularly well or particularly poorly, (4) after trying a new combination of skills to evaluate its retrospective effectiveness |
Capture what a session revealed about effective reasoning so future sessions can recognize similar situations and respond well.
When reflecting on a reasoning session:
- **Describe what happened.** Not just what you did, but the shape of the problem and how understanding evolved. What did the situation demand? How did it respond to different approaches?
- **Identify the pivot.** Most sessions have a moment where something shifted. What was it? A realization, a reframe, a piece of evidence that changed direction? This is often the most transferable insight.
- **Name what worked and why.** Not just "decompose helped" but why it helped in this specific situation. What about this problem made that approach effective? The reasoning behind success is more valuable than the success itself.
- **Name what didn't work and why.** Same principle. What about this problem made certain approaches ineffective? Understanding failure modes helps recognize when to avoid them.
- **Extract the transferable pattern.** If a future session encounters a similar situation, what should it recognize? What approach should it consider? Write this as a recognition-action pair: "When X, consider Y because Z."
- **Note the frame.** What perspective or approach guided this session? Would a different frame have led somewhere else? Understanding how framing shaped outcomes helps future sessions choose frames deliberately.
```json
{
  "what_happened": "Complex auth bug that only appeared for multi-role users. Spent first hour chasing token expiry (the obvious suspect) before realizing the symptoms didn't match. The breakthrough came when I stopped assuming and actually traced the execution path - the error was in role boundary checks, nowhere near token handling.",
  "pivot": "The moment I ran trace-flow and saw the actual execution path. My mental model was wrong. The code wasn't doing what I assumed. Once I saw the real flow, the bug was obvious.",
  "what_worked": "Structural decomposition followed by execution tracing. Breaking down the auth system into components (decompose) gave me a map, but tracing actual execution (trace-flow) showed me where the map was wrong. The combination was powerful - structure alone would have kept me in the wrong area.",
  "what_didnt_work": "Jumping to root-cause analysis before understanding the system. I was so eager to diagnose that I diagnosed based on assumptions rather than evidence. Cost me an hour.",
  "transferable_pattern": "When debugging, verify the execution path before hypothesizing causes. The urge to diagnose quickly leads to diagnosing the wrong thing. Sequence: decompose → trace-flow → verify understanding → then root-cause.",
  "frame": "Initially framed as 'token auth is broken'. Reframed to 'something in user handling has an edge case'. The reframe opened up the right search space."
}
```
```json
{
  "what_happened": "Deciding between two microservice architectures. Option B was technically superior on paper - better performance, cleaner separation. But during premortem, imagining B failing revealed a single point of failure that wasn't visible in the trade-off analysis. Changed the decision from 'B is better' to 'A is safer'.",
  "pivot": "The premortem on Option B. When I imagined 'it's six months from now and B has failed catastrophically - why?' the answer was immediate: the message broker was a single point of failure with no fallback. That risk wasn't in the pros/cons list but it was decisive.",
  "what_worked": "Running premortem AFTER trade-space analysis but BEFORE decision. Trade-space showed what each option optimized for. Premortem revealed failure modes that optimization analysis missed. The sequence mattered - premortem without trade-space would have been unfocused; trade-space without premortem would have missed the hidden risk.",
  "what_didnt_work": "Started with six-hats for multi-perspective thinking. Useful but would have been more effective after understanding system structure. Perspectives without structural understanding generated concerns that weren't grounded.",
  "transferable_pattern": "For high-stakes architecture decisions: understand structure first, then explore trade-offs, then imagine failures, then decide. Failure imagination (premortem) catches risks that optimization analysis misses.",
  "frame": "Framed as 'which option is technically better' initially. Should have been 'which option fails more gracefully'. The frame shift changed what counted as important."
}
```
```json
{
  "what_happened": "Stuck in option-generation loops. Kept using branch and alternatives to generate more possibilities, then weigh-options to evaluate them, but couldn't make progress. Generated dozens of options, couldn't choose between them. Eventually realized: I didn't have the information to evaluate any of them.",
  "pivot": "Recognizing that the problem wasn't insufficient options but insufficient information. I was generating options to avoid the discomfort of not knowing, but generation can't substitute for evidence.",
  "what_worked": "Nothing, until I stopped generating and started gathering. Once I paused option-generation and used knowledge-check to identify what I actually didn't know, the path forward became clear.",
  "what_didnt_work": "Generative skills without informational foundation. Branch and alternatives are powerful when you can evaluate the options. When you can't, they just create the illusion of progress.",
  "transferable_pattern": "When stuck in option-generation loops, it's usually an information problem not an idea problem. Signal: generating many options but unable to choose between them. Response: stop generating, identify what information is missing, gather it.",
  "frame": "Framed as 'I need more options'. Should have been 'I need more information'. The frame determined which skills seemed relevant."
}
```
Persist session reflections to .claude/.thinkies/.reflections/ for learning and pattern recognition. This directory is automatically gitignored when in a git repository.
Usage:

```shell
bun {baseDir}/scripts/save_reflection.ts --session <name> --data '<json>'
```

Options:

- `--session, -s`: The name of the session being reflected on (should match the notes session name)
- `--data, -d`: JSON object containing reflection data

Example:

```shell
bun {baseDir}/scripts/save_reflection.ts --session "auth-fix" --data '{"approach": "decompose then trace", "what_worked": ["Breaking down the auth flow first"], "lessons": ["Verify understanding before diagnosing"]}'
```
Behavior:

- Creates the `.claude/.thinkies/.reflections/` directory and adds it to `.gitignore` (if in a git repo)
- Writes `<session-name>.json` with the reflection data
- Updates `index.json` in the reflections directory for easy discovery

Output: The script displays a human-readable summary of the saved reflection, including the session name, file path, timestamp, and reflection content.
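The save behavior could be sketched roughly as follows, assuming Node-style `fs` APIs. `saveReflection` and the `saved_at` field are illustrative names, the `.gitignore` step is omitted, and the actual script may store its index differently:

```typescript
import { mkdirSync, writeFileSync, readFileSync, existsSync } from "node:fs";
import { join } from "node:path";

// Rough sketch of the save behavior: write <session>.json under the
// reflections directory and keep index.json listing saved sessions.
function saveReflection(rootDir: string, session: string, data: object): string {
  const dir = join(rootDir, ".claude", ".thinkies", ".reflections");
  mkdirSync(dir, { recursive: true });

  // Stamp the reflection so the summary can report when it was saved.
  const entry = { session, saved_at: new Date().toISOString(), ...data };
  const file = join(dir, `${session}.json`);
  writeFileSync(file, JSON.stringify(entry, null, 2));

  // Maintain a simple index of session names for easy discovery.
  const indexPath = join(dir, "index.json");
  const index: string[] = existsSync(indexPath)
    ? JSON.parse(readFileSync(indexPath, "utf8"))
    : [];
  if (!index.includes(session)) index.push(session);
  writeFileSync(indexPath, JSON.stringify(index, null, 2));

  return file;
}
```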
Integration: Save reflections after completing significant reasoning tasks. Use the same session name as your notes session to link them together.