| name | crucible |
| description | Stress-test any idea, proposal, or plan by finding its weakest points, probing hidden assumptions with Socratic questions, and identifying failure modes through red team analysis. Use when the user asks to stress-test, poke holes, find weaknesses, red team, challenge, or ask 'what could go wrong' or 'what am I missing'. |
| argument-hint | idea, plan, or proposal to stress-test |
Crucible
Context: $ARGUMENTS
Find everything wrong with this idea using Socratic questioning, hostile review, and adversarial simulation.
Workflow
1. Initialize (or pick up existing graph)
Check if a graph already exists from a prior steelman:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py show
If the graph is empty or there is no prior state, build it from scratch. Identify the idea's core claims and assumptions and register them:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py reset
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py add "<claim-id>" claim --obs "<fact>"
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py add "<assumption-id>" assumption --obs "<fact>"
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py relate "<claim-id>" assumes "<assumption-id>"
If a graph already exists, use it directly.
2. Socratic Elenchus (probe assumptions)
Read ${CLAUDE_SKILL_DIR}/references/socratic-method.md.
Apply Vlastos' steps:
- Identify the thesis (what the idea claims)
- Identify premises the proponent would agree with
- Show how those premises lead to conclusions that CONTRADICT the thesis
Ask 3-5 penetrating questions. Each question targets a specific assumption. Present the questions to the user.
3. Structural analysis
Run the graph analysis to find gaps the questions should have exposed:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py analyze
This reveals: ungrounded assumptions (no evidence), unstable claims (depend on contradicted assumptions), exposed weaknesses (no mitigation), contested claims (multiple counter-arguments), orphaned entities (disconnected from the argument).
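To make these categories concrete, here is a minimal Python sketch of how such checks could be computed on a toy in-memory graph. It is not the implementation of argument-graph.py, whose internals are not shown here. The `assumes`, `if_fails`, and `mitigates` relations come from the commands in this workflow; the `evidence` entity type, the `supports` and `contradicts` relations, and the names `Graph` and `structural_report` are assumptions made for illustration. Contested claims are left out because the workflow does not show how counter-arguments are registered.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy argument graph: entity id -> type, plus (source, relation, target) edges."""
    entities: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add(self, entity_id: str, entity_type: str) -> None:
        self.entities[entity_id] = entity_type

    def relate(self, source: str, relation: str, target: str) -> None:
        self.edges.append((source, relation, target))

    def targets_of(self, relation: str) -> set:
        """All entities that appear as the target of the given relation."""
        return {dst for _, rel, dst in self.edges if rel == relation}

    def of_type(self, entity_type: str) -> set:
        return {eid for eid, etype in self.entities.items() if etype == entity_type}


def structural_report(graph: Graph) -> dict:
    # "supports" and "contradicts" are hypothetical relation names.
    supported = graph.targets_of("supports")
    contradicted = graph.targets_of("contradicts")
    mitigated = graph.targets_of("mitigates")
    connected = {node for src, _, dst in graph.edges for node in (src, dst)}

    return {
        # Assumptions that nothing supports.
        "ungrounded_assumptions": sorted(graph.of_type("assumption") - supported),
        # Entities that assume something that has been contradicted.
        "unstable_claims": sorted(
            src for src, rel, dst in graph.edges
            if rel == "assumes" and dst in contradicted
        ),
        # Weaknesses with no mitigation pointing at them.
        "exposed_weaknesses": sorted(graph.of_type("weakness") - mitigated),
        # Entities that take part in no relation at all.
        "orphaned_entities": sorted(set(graph.entities) - connected),
    }


g = Graph()
g.add("saas-pivot", "claim")
g.add("users-will-pay", "assumption")
g.add("churn-data", "evidence")        # hypothetical entity type
g.add("no-sales-team", "weakness")
g.add("stray-note", "claim")           # never related to anything
g.relate("saas-pivot", "assumes", "users-will-pay")
g.relate("churn-data", "contradicts", "users-will-pay")
g.relate("users-will-pay", "if_fails", "no-sales-team")

for category, ids in structural_report(g).items():
    print(f"{category}: {ids}")
```

In this toy graph every category contains exactly one entity, which makes it easy to see which relation (or missing relation) triggers each finding.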
4. Hostile review
Read ${CLAUDE_SKILL_DIR}/references/crucible.md.
For each stakeholder who would need to approve or fund this idea, ask: what is their harshest possible question? Can the idea survive it as stated?
Register weaknesses found:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py add "<weakness-id>" weakness --obs "<what makes it weak>"
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py relate "<assumption-id>" if_fails "<weakness-id>"
5. Red team (failure modes)
Read ${CLAUDE_SKILL_DIR}/references/red-team.md.
Identify 3-5 ways this idea fails. For each failure mode:
- What happens?
- How likely is it?
- How bad is it?
- Is there a defense?
If mitigations exist, register them:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py add "<mitigation-id>" mitigation --obs "<how it helps>"
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py relate "<mitigation-id>" mitigates "<weakness-id>"
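The four questions asked of each failure mode can be captured in a small record, which also gives a simple way to order the findings. The sketch below is one possible approach, not part of the skill: the 1 to 5 scales, the likelihood-times-severity score, and the names `FailureMode` and `triage` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureMode:
    name: str                      # short kebab-case id, as used in the graph
    what_happens: str              # the specific mechanism of failure
    likelihood: int                # hypothetical scale: 1 (rare) to 5 (near certain)
    severity: int                  # hypothetical scale: 1 (minor) to 5 (fatal)
    defense: Optional[str] = None  # None means no known mitigation

    @property
    def risk(self) -> int:
        """Simple illustrative score: likelihood multiplied by severity."""
        return self.likelihood * self.severity


def triage(modes):
    """Order by risk, highest first; on ties, undefended modes come first."""
    return sorted(modes, key=lambda m: (-m.risk, m.defense is not None, m.name))


modes = [
    FailureMode("slow-adoption", "Teams keep using the old tool", 5, 2,
                defense="Pilot with two teams first"),
    FailureMode("key-engineer-leaves", "Only one person understands the pipeline", 2, 5),
    FailureMode("vendor-price-hike", "Renewal doubles the licence cost", 4, 3,
                defense="Negotiate a multi-year contract"),
]

for mode in triage(modes):
    status = mode.defense or "NO DEFENSE"
    print(f"{mode.name}: risk={mode.risk} ({status})")
```

Requiring a `what_happens` string for every entry keeps each failure mode tied to a specific mechanism rather than a generic worry.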
6. Present findings
Run analysis again to see the updated picture:
uv run ${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py analyze
Output format
Present three sections:
Socratic Questions: 3-5 questions and the assumptions they target
Weakest Points: What a hostile reviewer would attack (from the structural analysis and the hostile review)
Failure Modes: How this fails, likelihood, severity, and whether defenses exist
Resources
${CLAUDE_SKILL_DIR}/references/socratic-method.md — load always. Vlastos' elenchus steps, the maieutic method, Boghossian's warning about humiliation vs. productive discomfort.
${CLAUDE_SKILL_DIR}/references/crucible.md — load always. The stress-test review board method, NASA/Pentagon origin, the principle that the review must be MORE adversarial than the real presentation.
${CLAUDE_SKILL_DIR}/references/red-team.md — load always. RAND Corporation origins, Israel's Ipcha Mistabra, groupthink as the enemy, reconnaissance before attack.
${CLAUDE_PLUGIN_ROOT}/scripts/argument-graph.py — run throughout to register weaknesses, mitigations, and analyze structural gaps.
Gotchas
- Run analyze before writing findings. The graph catches things intuition misses: assumptions with zero supporting evidence, claims that depend on contradicted assumptions.
- Nemeth's warning applies here too. Generic "but what if it fails?" is worthless. Every objection must cite a specific mechanism of failure.
- If a graph already exists from a prior /steelman, use it. Don't rebuild from scratch. The crucible builds on what steelman established.
- Graph entity names: short kebab-case identifiers. Observations carry the detail.
- The graph state persists in /tmp. If /verdict runs after this, it reads the same graph.
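The naming gotcha above can be checked mechanically before an entity is registered. This is a small sketch, not something argument-graph.py is known to provide: the helper names `is_kebab_case` and `to_kebab_case` are hypothetical, and the four-word limit is an assumed reading of "short".

```python
import re

# Lowercase alphanumeric words joined by single hyphens, e.g. "users-will-pay".
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(entity_id: str) -> bool:
    """True if the whole id is kebab-case (no capitals, underscores, or spaces)."""
    return KEBAB_CASE.fullmatch(entity_id) is not None


def to_kebab_case(label: str, max_words: int = 4) -> str:
    """Turn a free-text label into a short kebab-case id.

    max_words is an assumed cap; the detail belongs in the observation text.
    """
    words = re.findall(r"[a-z0-9]+", label.lower())
    return "-".join(words[:max_words])


print(is_kebab_case("users-will-pay"))
print(is_kebab_case("Users_Will_Pay"))
print(to_kebab_case("Users will pay for the premium tier"))
```

The full sentence that was shortened into the id is what would go into the `--obs` text, so no detail is lost by keeping the identifier brief.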