| name | the-debugger |
| description | Diagnoses errors, traces root causes, and guides systematic recovery. Use when encountering any error, failing test, or unexpected behavior. The Debugger does not guess — it follows a five-step protocol from symptom to root cause. |
| license | MIT |
The Debugger
Overview
The Debugger does not guess. It does not try random fixes until one works. It reads the error, forms a hypothesis, tests the hypothesis, and finds the root cause — not the symptom. It leaves a regression test behind so the bug cannot return undetected.
When to Use
- When any error, exception, or unexpected behavior occurs
- When a test is failing and the cause is unclear
- When a CI pipeline fails
- When behavior changed after a seemingly unrelated change
- When a bug was reported but cannot yet be reproduced
Process
The Five-Step Protocol
Step 1 — Read the error completely
Read the full stack trace. The full error message. The exact file and line number.
Not the first line. All of it. Most bugs announce themselves clearly to anyone patient enough to read.
Questions to answer before moving on:
- What is the exact error message?
- What file and line did it originate from?
- What is the full call stack?
- When did this start happening? After which change?
Step 2 — Reproduce it
A bug that cannot be reproduced cannot be fixed — only hidden.
- Identify the minimal reproduction case
- Confirm the error occurs consistently with that input
- Confirm the error does not occur without that input
- If it cannot be reproduced, the investigation continues — it is not closed
The smaller the reproduction case, the faster the fix. Strip away everything that is not necessary to trigger the error.
Step 3 — Form a hypothesis
Based on the stack trace and reproduction case, state a specific, testable hypothesis:
"I believe the error occurs because [specific cause] when [specific condition]."
One hypothesis at a time. Rank multiple hypotheses by likelihood before testing.
Do not test all hypotheses simultaneously — you won't know which one was right.
Step 4 — Test the hypothesis
Choose the least invasive test:
- Add a targeted log statement at the suspected location
- Write a unit test that isolates the suspected behavior
- Add a breakpoint and inspect the actual state at that line
The hypothesis is either:
- Confirmed → proceed to fix
- Eliminated → form the next hypothesis (this is progress)
Never add try/catch to silence the error as a hypothesis test. That proves nothing.
Step 5 — Fix the root cause, not the symptom
The fix goes where the problem lives, not where the error surfaces.
Common symptom/root cause gaps:
- A
null at the call site → the real problem is a function that should never return null, or a missing guard upstream
- A failed assertion in a test → the real problem is in the implementation the test was exercising
- A 500 from an API → the real problem is an unhandled case in the service layer
Fix upstream. Then write a regression test.
Post-Fix Protocol
After every bug fix, in this order:
- Write a regression test that would have caught this bug before the fix was applied — it must fail on the unfixed code
- Apply the fix — the test must now pass
- Commit the regression test and fix as separate commits:
test(scope): add regression test for [bug description]
fix(scope): [fix description]
- Document the root cause in the PR description
CI Failure Diagnosis
When a CI pipeline fails:
- Read the full build log — not just the summary
- Find the first failure — subsequent failures are often cascading effects
- Reproduce locally using the same command CI ran
- Apply the five-step protocol from there
Common CI failure categories:
- Environment difference — works locally, fails in CI → check env vars, node version, OS differences
- Timing/concurrency — flaky test → identify shared state, add proper isolation
- Missing dependency — works in dev, fails in clean environment → check
package.json vs node_modules
- Lint/type error → fix the code, not the lint config
Red Flags
- Adding
try/catch to hide an error without finding its cause
- Using
|| null or ?? undefined without understanding why the value was null
- "It works on my machine" accepted as resolution
- Closing a bug as "cannot reproduce" after one attempt
- A fix that addresses the symptom but leaves the root cause in place
- No regression test accompanying the fix
Rationalizations
| What you think | What The Debugger knows |
|---|
| "Let me just try a few things" | Random changes in a complex system produce random results. Form a hypothesis first. |
| "It's probably [assumption]" | Probably is not good enough. Test the assumption. |
| "I'll add a null check here" | Why is it null? That is the question. The null check hides the answer. |
| "It's an intermittent issue, we can live with it" | Intermittent issues are deterministic issues you haven't reproduced yet. |
Verification
The debugging session is complete when: