Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

modes-of-reasoning-project-analysis

Étoiles0

Forks0

Mis à jour8 mai 2026 à 12:42

Multi-perspective project analysis via NTM reasoning-mode agent swarm. Use when "modes of reasoning", "multi-perspective analysis", or "epistemological review" of any project.

Installation

Installer avec Codex ou Claude Copiez ce prompt, collez-le dans Codex, Claude ou un autre assistant, puis laissez-le vérifier la page du skill et l'installer pour vous.

Exécuter dans Manus

Source

oysteinkrog

oysteinkrog/dotfiles

Ouvrir le dépôt GitHub Voir les dépôts du créateur

Téléchargement

Exécuter dans Manus

Métiers associésSOC

Basé sur la classification professionnelle SOC

Développeurs de logicielsProfessions informatiques et mathématiques·SOC 15-1252

Explorateur de fichiers

10 fichiers

SKILL.md

readonly

Plus depuis ce dépôt

même dépôt

consult-oracles

oysteinkrog/dotfiles

Consult Fable (primary oracle) for expert second opinions; escalate to GPT-5.5-Pro only for extremely important or complex tasks (always paired with Fable). Use for complex decisions, architecture choices, debugging hard problems, or when user says "consult oracles", "ask the experts", or wants a second opinion.

2026-06-100

oracle-review

oysteinkrog/dotfiles

Run iterative oracle + agent hardening loop on any artifact (designs, plans, beads, architecture) until findings converge to near-zero. Combines /swarm-oracle with /swarm-review in alternating rounds. Use for the full hardening cycle, not just a single oracle pass. For oracle-only, use /swarm-oracle. For bead-only hardening, use /swarm-beads-quality.

2026-06-100

oracle-consensus

oysteinkrog/dotfiles

Run 2x oracle sessions (FOR + AGAINST stances) to validate design decisions, plans, or bead readiness. Default = two Fable subagents; escalate to PAL 2x GPT-Pro (always paired with Fable) for extremely important or complex validations. Use after design rounds, before implementation, or to challenge architecture decisions.

2026-06-100

sync-human

oysteinkrog/dotfiles

Act as a wise, effective teacher whose goal is to make the human deeply understand the work done in this session (a change, a bug fix, a feature, a design) — i.e. sync the human's mental model up to the agent's. Use when the user says "sync-human", "sync me up", "teach me this session", "make sure I understand", "walk me through what we did", "quiz me on this", or "I want to actually understand this PR/change", or otherwise wants Socratic, gated, incremental teaching with comprehension checks rather than a one-shot summary. Drives understanding at both high level (motivation, impact) and low level (business logic, edge cases) using a running checklist and quizzes.

2026-06-050

agent-mail

oysteinkrog/dotfiles

MCP Agent Mail for multi-agent coordination. Use when agents need file locks, messaging, inboxes, or conflict prevention. Handles macro_start_session, file_reservation_paths, send_message, threading, pre-commit guards.

2026-05-290

secret-lookup

oysteinkrog/dotfiles

Retrieve API tokens, keys, and credentials Oystein has stored locally. Use whenever code, scripts, or shell commands need a secret value: GitHub tokens, Cloudflare, HubSpot, Slack, Zendesk, Jira, Sentry, Anthropic, Apify, Browserbase, Google OAuth, Huma. Use BEFORE searching shell history, session logs, dotfiles, or the filesystem — the canonical store is documented here and the values are reachable via two fish helpers. Also use when adding, rotating, or removing a credential.

2026-05-110

Exécutez n'importe quel Skill en un clic

name	modes-of-reasoning-project-analysis
description	Multi-perspective project analysis via NTM reasoning-mode agent swarm. Use when "modes of reasoning", "multi-perspective analysis", or "epistemological review" of any project.

Modes of Reasoning Project Analysis

Core idea: Spawn 10 agents (default), each assigned a distinct reasoning mode from the 80-mode taxonomy, to analyze a project from radically different analytical lenses. Synthesize their outputs into a single comprehensive report with provenance-tracked findings, triangulated consensus, and scored contributions.

Scope: Not just software bugs. This finds methodological flaws, logical fallacies, misconceptions, reasoning errors, architectural blind spots, missing perspectives, and also generates new ideas, extensions, and radical innovations.

You are the lead agent. You select modes, orchestrate the swarm, monitor progress, apply early-stopping heuristics, collect outputs, triangulate consensus, resolve disagreements with explicit conflict-resolution strategies, score contributions, and compile the final report. The swarm agents do the deep analysis; you do the meta-reasoning and synthesis.

Philosophical foundation: There is no single correct way to reason. "Mode of reasoning" refers to at least four overlapping concepts: inference patterns (deduction vs abduction), uncertainty representations (probability vs fuzzy membership), problem-solving methods (planning, optimization), and domain styles (scientific, legal, ethical). Real-world reasoning is inherently hybrid. The power of this skill comes from making the hybrid explicit and orchestrating diversity with discipline.

The Seven Taxonomy Axes

These axes are the cognitive backbone of mode selection. Most reasoning disagreements trace to mixing up which axis matters. Before selecting modes, identify which axes are load-bearing for THIS project:

Axis	Pole 1	Pole 2	Selection Impact
Ampliative vs Non-ampliative	Conclusions go beyond premises (induction, abduction)	Conclusions contained in premises (deduction)	Discovery needs ampliative; assurance needs non-ampliative
Monotonic vs Non-monotonic	Adding info never retracts conclusions	Adding info can retract conclusions	Most real reasoning is non-monotonic; pure math is monotonic
Uncertainty vs Vagueness	Uncertainty about crisp facts (probability)	Vagueness in predicates themselves (fuzzy)	Don't treat "tall" as a probability problem
Descriptive vs Normative	What is (facts, causes)	What ought (values, duties)	Decisions fail when values hide in "facts"
Belief vs Action	What to believe/accept	What to do/choose	Separating belief from decision improves accountability
Single-agent vs Multi-agent	Uncertainty as noise	Others respond strategically	Security, markets, governance need multi-agent modes
Truth vs Adoption	Accuracy/validity	Audience/coordination	Org failures are often rhetorical, not logical

Rule: Your 10 modes MUST span at least 3 axes. If all modes sit on the same side of every axis, you have an echo chamber. See TAXONOMY_AXES.md for full axis analysis.

Arguments

Parse from invocation text. Defaults:

Argument	Default	Description
`--agents=N`	10	Number of agents (max 10, each gets one mode)
`--cc=N`	5	Claude Code agents
`--cod=N`	5	Codex agents
`--gmi=N`	0	Gemini agents (optional, for 3-model triangulation)
`--project=PATH`	cwd	Target project to analyze
`--focus=TOPIC`	(none)	Optional focus area to bias mode selection
`--depth=LEVEL`	deep	Analysis depth: quick (15min), deep (45min), exhaustive (90min+)
`--creative`	false	Bias mode selection toward innovation/ideation
`--adversarial`	false	Bias mode selection toward attack/critique
`--concise`	false	Compact 7-section report (skip appendices, scoreboard, mode perf notes)
`--output=PATH`	`MODES_OF_REASONING_REPORT_AND_ANALYSIS_OF_PROJECT.md`	Report filename

Phase 0: Context Pack Generation

Before selecting modes, build a rich understanding of the project. This prevents thin-context failures where agents analyze superficially because they lack project understanding.

Project Profiling

# 1. Read ALL project docs
cat README.md AGENTS.md CLAUDE.md CONTRIBUTING.md 2>/dev/null
# 2. Language and framework detection
find . -type f \( -name '*.go' -o -name '*.py' -o -name '*.ts' -o -name '*.rs' -o -name '*.java' \) | head -80
# 3. Structure analysis
find . -type d -not -path './.git/*' -not -path './node_modules/*' | head -40
# 4. Recent trajectory
git log --oneline -30
git diff --stat HEAD~10..HEAD 2>/dev/null
# 5. Open issues / beads if available
br ready --json 2>/dev/null || true
# 6. Test posture
find . -name '*_test.*' -o -name '*.test.*' -o -name 'test_*' | wc -l

Thin Context Detection

If ANY of these are true, ask the user for more context before proceeding:

No README or it's < 100 words
No clear entry point identified
Project purpose is ambiguous after reading docs
You cannot identify the primary programming language
The focus area is specified but you don't understand what it means in this project

Deployment Context (MANDATORY)

Before any analysis, capture the project's real-world deployment context. All severity ratings in the report MUST reference this context, not a hypothetical worst case.

Record:

Who runs this? Solo developer, small team, large org?
Where does it run? Localhost-only, internal network, public-facing internet?
Who are the users? Just the developer, a small team, thousands of end-users, zero users (pre-release)?
What is the threat model? What attackers are realistic? What's the actual attack surface?
Development stage: Early prototype, active development, production, maintenance mode?

Why this matters: A finding that is CRITICAL for a public-facing web service may be LOW for a localhost developer tool. A recommendation that makes sense for a 10-person team is absurd for a solo developer in early development. Three post-mortems showed severity inflation when deployment context was ignored — e.g., rating a localhost API as "CRITICAL" risk when the attacker would need shell access to exploit it (at which point they don't need the API).

Known Limitations Pre-Filter

Before dispatch, extract all documented limitations, caveats, and acknowledged risks from:

README.md, AGENTS.md, CLAUDE.md
Issue trackers, CHANGELOG, TODO files
Inline comments flagging known issues

Tag each as "owner-acknowledged." During synthesis (Phase 6), findings that merely restate owner-acknowledged limitations MUST be categorized separately as "Confirmed Known Risks" rather than presented as discoveries. This prevents the report from spending 30% of its length telling the developer what they already wrote.

Project Identity Check

Identify the project's core substrate — the dependency or concept that defines what the project IS, not just what it uses:

Identity test: "Would removing X change what this project IS?" If yes, coupling to X is a feature, not a bug.

Examples:

A "Named Tmux Manager" — tmux is the identity, not an incidental dependency
A "Rust filesystem" — Rust is the identity
A "React dashboard" — React is a tool choice, not the identity (usually)

Record the core substrate. In Phase 6, any recommendation to "abstract away" or "decouple from" the core substrate MUST be filtered through this identity check. The skill has a documented history of recommending abstraction of the core substrate — this is always wrong.

Project Values and Rules

Read the project's stated values and rules (from AGENTS.md, CLAUDE.md, README). Record them. Examples:

"No tech debt, do it right" → modes penalizing complexity should have their severity discounted
"No premature abstraction" → filter recommendations that add abstraction layers
"Ship fast, iterate" → weight practical modes higher than formal ones

These values inform mode selection (Phase 1) and filter recommendations (Phase 6).

Context Pack (record for later)

Write a mental note (or scratch file) capturing:

Project brief: name, purpose, languages, frameworks, LOC estimate
Architecture: key components, data flow, integration points
Recent trajectory: what has been changing, what direction
Known concerns: from docs, issues, or user's focus area
Stakeholders: who uses this, who maintains it, who depends on it
Success criteria: what "good" looks like for this project
Deployment context: from above (who, where, users, threat model, stage)
Core substrate: from identity check
Project values: from rules extraction
Known limitations: list of owner-acknowledged issues

This context pack will be embedded in every agent's prompt.

Phase 1: Mode Selection

This is the critical step. You must select 10 modes that maximize analytical coverage for THIS specific project. Do not pick randomly.

The Selection Algorithm

Identify the 2-3 most important taxonomy axes for this project (see table above)
Read the full taxonomy in MODE_TAXONOMY.md
Check mode composition stacks in MODE_COMPOSITION.md for proven combinations
Select modes ensuring:
- At least 5 of 12 categories (A-L) represented
- At least 3 of 7 taxonomy axes spanned
- At least 2 modes that directly oppose each other (e.g., worst-case vs option-generation)
- At least 1 meta-reasoning mode (L category) for calibration
- If --creative: at least 3 from B (Ampliative) category
- If --adversarial: at least 3 from F+H (Causal + Strategic) categories
Check against project values -- read the project's stated rules from Phase 0. If the project says "no premature abstraction," bias away from modes that tend to recommend decoupling (B9 Simplicity). If the project says "do it right, no tech debt," don't over-weight modes that penalize complexity.
Record your rationale -- which axes drove the selection, which modes were considered but excluded

Selection Criteria

Project type -- a web app needs different modes than a compiler or a research paper
Known risks -- if the project has security concerns, include adversarial modes
Focus area -- if --focus is specified, bias selection toward relevant categories
Diversity -- spread across at least 5 of the 12 categories (A-L) to avoid blind spots
Complementarity -- pair modes that challenge each other (e.g., deductive + inductive, worst-case + option-generation)
Axis coverage -- ensure both poles of the most important axes are represented
Model diversity -- assign modes to different agent types (Claude/Codex/Gemini) to exploit model-level analytical differences

Recommended Starting Distributions

Software project (general):

Mode	Code	Why	Axis Coverage
Systems-Thinking	F7	See the whole architecture	Descriptive
Root-Cause	F5	Find underlying problems, not symptoms	Descriptive, causal
Deductive	A1	Verify logical consistency of design	Non-ampliative, monotonic
Inductive	B1	Pattern-match from observed code to general flaws	Ampliative, non-monotonic
Adversarial-Review	H2	Stress-test assumptions	Multi-agent, action
Failure-Mode	F4	What can go wrong, will go wrong	Action, uncertainty
Edge-Case	A8	Boundary conditions and corner cases	Non-ampliative
Perspective-Taking	I4	See it from users, maintainers, attackers	Multi-agent, adoption
Counterfactual	F3	What if key decisions were made differently?	Ampliative, belief
Debiasing	L2	Catch cognitive biases in the other 9 modes	Meta-reasoning

Research/methodology project: Replace Edge-Case and Failure-Mode with Bayesian (B3) and Ambiguity-Detection (D2).

Creative/product project: Replace Root-Cause and Adversarial with Conceptual-Blending (B8) and Analogical (B6).

Incident/postmortem: Abduction (B5) + Causal-Inference (F1) + Root-Cause (F5) + Diagnostic (G11) + Counterfactual (F3) + Satisficing (G5) + Second-Order-Effects (F6) + Belief-Revision (E1) + Calibration (L2) + Adversarial (H2)

Strategy/planning: Reference-Class (B10) + Game-Theoretic (H1) + Decision-Analysis (G1) + Robust/Worst-Case (L3) + Negotiation (H3) + Scenario-Simulation (F7) + Sensemaking (I5) + Fermi (B11) + Multi-Criteria (G6) + Strategic-Planning (G2)

See MODE_COMPOSITION.md for 12+ proven mode stacks.

After selecting, record your choices and rationale. You will need this for the report.

Phase 2: Spawn the Swarm

ntm spawn $PROJECT \
  --cc=$NUM_CC --cod=$NUM_COD \
  --no-user \
  --stagger-mode=smart

Wait for agents to be ready:

ntm --robot-wait=$PROJECT --condition=idle --timeout=120

Model-to-Mode Affinity

Not all modes work equally well on all models. Suggested assignments:

Best on Claude (nuanced analysis)	Best on Codex (thorough enumeration)	Either
Systems-Thinking (F7)	Edge-Case (A8)	Root-Cause (F5)
Counterfactual (F3)	Inductive (B1)	Deductive (A1)
Perspective-Taking (I4)	Failure-Mode (F4)	Adversarial (H2)
Debiasing (L2)	Dependency-Mapping (F2)	Option-Generation (B5)
Conceptual-Blending (B8)	Type-Theoretic (A7)	Bayesian (B3)

Phase 3: Dispatch Mode-Specific Prompts

Send each agent a unique, mode-specific prompt. Target agents individually by pane index. Stagger dispatch by 15-20 seconds to prevent thundering-herd effects:

for pane in 0 1 2 3 4 5 6 7 8 9; do
  ntm send $PROJECT --pane=$pane "$(cat <<'PROMPT'
  <INSERT MODE-SPECIFIC PROMPT FROM PROMPTS.md>
  PROMPT
  )"
  sleep 18
done

The prompt for each agent MUST include:

Project context pack -- the brief from Phase 0
Project study directive -- read README, AGENTS.md, understand architecture
Mode assignment -- the specific reasoning mode with full description, failure modes, and differentiator
Taxonomy axis awareness -- which axes this mode operates on
Output contract -- mandatory structured output format with provenance
Scope -- analyze the ENTIRE project, not just code
Thinking directive -- "Think hard about this. Apply your framework rigorously."

See PROMPTS.md for the full prompt template and per-mode customization.

The Mode Assignment Block (include in every prompt)

YOUR REASONING MODE: [Mode Name] ([Code])
Category: [Category]
Taxonomy Axes: [Which axes this mode operates on]

[Full mode description from taxonomy]

WHAT YOU PRODUCE:
[Mode-specific outputs]

BEST APPLIED TO:
[List of best-for scenarios]

WATCH OUT FOR (your failure modes):
[List of failure modes for this reasoning approach]

WHAT MAKES THIS MODE UNIQUE:
[Differentiator]

COMPLEMENTARY MODES IN THIS ENSEMBLE:
[List of other modes in this run and how they relate to yours]

The Output Contract (include in every prompt)

Every agent MUST produce a file named MODE_OUTPUT_[MODE_ID].md containing:

# [Mode Name] ([Code]) Analysis

## Thesis
One-paragraph summary of this mode's core finding about the project.

## Top Findings
1. [Finding with evidence and reasoning]
2. ...
(5-8 findings. Every one must cite specific evidence.)

Each finding MUST include:
- §F[N]: Finding ID for cross-referencing (e.g., §F1, §F2)
- Evidence: specific file, function, line, or document reference
- Reasoning chain: how this mode specifically reveals this
- Severity: critical / high / medium / low (calibrated against deployment context, not theoretical worst case)
- Confidence: 0.0-1.0 for this specific finding
- So What?: "If the project owner reads this, what specifically would they do differently tomorrow?" (findings without a concrete next-day action are demoted to "observations")

5-8 findings. Quality over quantity. Every finding must cite specific evidence.

## Risks Identified
- [Risk with severity and likelihood assessment]

## Recommendations
- [Actionable recommendation with justification]
- Priority: P0-P4 (same scale as beads)
- Effort: low / medium / high
- Expected benefit description

## New Ideas and Extensions
- [Proposed improvement or extension with rationale]
- Innovation score: incremental / significant / radical

## Assumptions Ledger
- [Unstated assumption this analysis depends on]
- [Assumption the PROJECT makes that this mode questions]

## Questions for Project Owner
- [Question that this analytical lens raises]

## Points of Uncertainty
- [Where this mode's analysis is uncertain and why]

## Agreements and Tensions with Other Perspectives
- [What you expect other modes might agree/disagree with]

## Confidence: [0.0-1.0]
Calibration note: [brief justification -- what would change your confidence?]

Phase 4: Monitor the Swarm

Monitoring Cron (fires every 3 minutes)

Set up immediately after dispatch:

CronCreate(
  cron: "*/3 * * * *",
  recurring: true,
  prompt: "Check the modes-of-reasoning swarm for $PROJECT. Run these commands:
    1. ntm --robot-is-working=$PROJECT
    2. ntm --robot-tail=$PROJECT --lines=80
    3. ls -la MODE_OUTPUT_*.md 2>/dev/null

  For each agent, determine:
  (a) Working / idle / stuck / rate-limited?
  (b) Has it produced its MODE_OUTPUT_*.md file?
  (c) If output exists, is it substantive or superficial?

  ACTIONS:
  - If agent idle + no output: send nudge with mode-specific reminder
  - If agent idle + output exists: agent is done, note completion
  - If agent stuck (same output for 2+ checks): send depth nudge
  - If output superficial (< 3 findings, no evidence): send depth nudge
  - If all agents done or hard timeout reached: cancel this cron and report

  QUALITY ASSESSMENT on each check:
  - Read what agents are actually producing, not just status
  - Are findings citing specific evidence (files, lines, docs)?
  - Is the mode's unique lens visible in the analysis?
  - Flag any agent that is describing the mode instead of applying it

  EARLY STOPPING HEURISTIC:
  - If 8+ agents have produced substantive output and remaining agents show no progress for 2 consecutive checks, proceed to collection.

  Report concisely: N done, N working, N stuck, quality observations."
)

Save the cron job ID for later cancellation.

Nudge Prompts for Stuck Agents

Generic nudge (idle, no output):

ntm send $PROJECT --pane=$N "You are analyzing this project through the lens of [MODE_NAME]. Go deeper. Don't just describe what you see -- apply your specific analytical framework to find things other perspectives would miss. Write your findings to MODE_OUTPUT_[MODE_ID].md with all required sections."

Depth nudge (superficial output):

ntm send $PROJECT --pane=$N "Your MODE_OUTPUT_[MODE_ID].md needs more depth. Each finding must cite specific evidence (file paths, function names, line numbers, document sections). Your analysis should reveal insights that ONLY the [MODE_NAME] lens can see. Rewrite with substantially more analytical depth and at least 8 findings."

Completion nudge (nearly done):

ntm send $PROJECT --pane=$N "Finalize MODE_OUTPUT_[MODE_ID].md. Ensure: (1) every finding has a §F[N] ID and evidence, (2) the Assumptions Ledger is filled in, (3) your confidence is calibrated honestly, (4) the 'Agreements and Tensions' section predicts where other modes will agree/disagree with you."

Early Stopping Heuristics

Borrowed from NTM's ensemble pipeline:

Velocity tracking: If an agent's findings-per-minute drops below 0.5 for 6+ minutes, it's hitting diminishing returns
Similarity detection: If a new finding duplicates an existing one (same file, same issue), the mode is saturated
Output completeness: If all mandatory sections are filled and confidence is calibrated, the agent is done
Don't wait for perfection -- 8/10 substantive outputs is enough to proceed

Phase 5: Collect and Score Outputs

Once agents have produced output (or after timeout):

Cancel the monitoring cron: CronDelete(id: $JOB_ID)
Capture final pane state:
```
ntm --robot-tail=$PROJECT --lines=200
```
Collect all output files:
```
ls -la MODE_OUTPUT_*.md
```
Read every output file completely. Do not skim. You need deep understanding for synthesis.
Score each mode's contribution using these metrics:
- Findings count: total findings produced
- Unique findings: findings no other mode produced
- Evidence quality: proportion of findings with specific file/line references
- Calibration quality: does the confidence score match the evidence quality?
- Mode fidelity: is the mode's unique analytical lens visible in the output?

Compute contribution scores (from NTM's ensemble scoring):

score = 0.40 * (findings_produced / total_findings)
      + 0.30 * (unique_insights / total_unique)
      + 0.20 * (evidence_quality_rating)
      + 0.10 * (calibration_quality_rating)

Build a provenance chain for each finding:

§F[N] → discovered by [Mode] → deduplicated with [§F[M]] → cited in [report section]

Phase 5.5: Ground Truth Verification

Before synthesis, spot-check the top 3-5 highest-impact findings. This phase prevents the single most common failure mode observed across three real-world deployments: agents making claims (especially "zero callers," "dead code," "unused module") that turn out to be wrong because grep-based verification missed indirect call paths.

For each candidate high-impact finding:

Identify the specific evidence claim (e.g., "zero callers for SpectralHealthMonitor")
Run an independent verification using a different methodology than the agent used:
- If the agent grepped for module imports → search for type names, constructor calls (::new), method calls
- If the agent did static analysis → run tests (cargo test, npm test, etc.) to verify dynamic behavior
- If the agent checked file sizes → read the actual code to understand what it does
- If the agent claimed "unused" → check for conditional compilation, re-exports, trait-mediated indirection, test-only usage
If verification fails, downgrade to HYPOTHESIS regardless of how many modes agreed on it

Why this phase exists: In one deployment, 4 modes agreed code was "dead" (30-40K lines). Actual dead code was ~15-18K. The agents all used the same methodology (grep for use statements) and missed integration through indirect paths. The "convergence" was methodological monoculture, not independent confirmation. In another deployment, modules marked "zero callers" were actively wired into the scheduler governor. This phase would have caught both failures.

Dynamic Analysis Requirement

For software projects, at least one verification step in Phase 5.5 must involve running the code — not just reading it. Execute cargo test, npm test, go test, or equivalent. Claims about runtime behavior, performance, or "reachability" that are based purely on static analysis have a documented false-positive rate of ~40% in complex codebases.

Phase 6: Synthesize the Report

This is the most important phase. You are now the synthesis agent. Read REPORT_TEMPLATE.md for the full template and SYNTHESIS_METHODOLOGY.md for the triangulation process.

The Triangulation Protocol

Adapted from operationalizing-expertise methodology:

3+ modes agree on a finding via DIFFERENT evidence → KERNEL (high confidence). Independent discovery via different analytical frameworks AND different evidence pathways is the strongest signal. If all modes cited the same file sizes or import counts, that is ONE observation seen by many — not independent triangulation. KERNEL requires at least 2 distinct evidence methodologies.
2 modes agree → SUPPORTED. Include with note on limited cross-validation.
1 mode only → HYPOTHESIS. Present as unique insight. Valuable if evidence is strong.
Modes actively disagree → DISPUTED. Document both positions with full reasoning chains. Apply conflict resolution (see below).

Conflict Resolution Strategies

When modes disagree, diagnose WHY before resolving:

Diagnosis	Example	Resolution Strategy
Different evidence examined	One read code, another read docs	Both may be right in their domain -- combine
Different values prioritized	Security vs usability tradeoff	Present as genuine tradeoff, do not pick winner
Different assumptions	Current scale vs 10x growth	Present conditionally: "If X, then A; if Y, then B"
Analytical error	Mode misread the code	Identify error, discount that finding, note in Mode Performance
Genuine contradiction	"This is secure" vs "This is exploitable"	Investigate deeper yourself; one is wrong
Different axes	Descriptive vs normative	Both are correct; they're answering different questions

Explanation Layer

For every conclusion in the report, record:

Source modes: which modes contributed
Source findings: specific §F[N] references
Confidence basis: why this confidence level
Supporting evidence: what backs it up
Counter-evidence: what pushes against it
Conflict resolution method: if modes disagreed, how was it resolved (consensus / majority / weighted / adversarial / deferred)

The Final Report

Write MODES_OF_REASONING_REPORT_AND_ANALYSIS_OF_PROJECT.md (or custom --output path).

The report MUST contain these sections (see REPORT_TEMPLATE.md):

Executive Summary -- 1 page max, 3-5 key takeaways
Methodology -- which 10 modes were selected, which axes drove selection, rationale
Taxonomy Axis Analysis -- which axes matter most for this project and what the swarm revealed about each
Convergent Findings (Kernel) -- 3+ mode agreement, highest confidence, with provenance
Supported Findings -- 2-mode agreement
Divergent Findings -- points of disagreement with full reasoning chains for each position
Unique Insights by Mode -- per-mode findings no other mode caught (the value of diversity)
Risk Assessment -- aggregated risks with severity, likelihood, and agreement level
Recommendations -- prioritized with supporting mode count and effort estimates
New Ideas and Extensions -- innovations scored as incremental/significant/radical
Assumptions Ledger -- project assumptions surfaced across all modes
Open Questions -- unresolved questions for the project owner
Confidence Matrix -- per-finding confidence with supporting/dissenting modes
Contribution Scoreboard -- per-mode scores, diversity metric, coverage analysis
Mode Performance Notes -- which modes were most/least productive and why
Mode Selection Retrospective -- would you choose different modes with hindsight?
Appendix: Individual Mode Outputs -- full text or summary of each mode's analysis
Appendix: Provenance Index -- finding ID → source mode → report location mapping

Phase 7: Operationalize (Optional)

If the user wants to ACT on findings rather than just read them:

Create beads for top recommendations:

br create --title="[Recommendation title]" --type=task --priority=[P0-P4]

Offer follow-up swarm runs:
- "Would you like a deeper dive on any specific finding?"
- "Should I run a focused swarm on the disputed findings?"
- "Want me to create an implementation plan for the top 3 recommendations?"
Operationalize recurring insights using the operationalizing-expertise skill:
- Extract operator cards from findings that represent reusable analytical moves
- Build a project-specific kernel of confirmed principles
- Create validation gates based on the risks identified
Post-implementation feedback loop (after the top 3-5 recommendations are implemented):
- Assess: "Did implementing this reveal any issues with the original analysis?"
- Were any findings harder/easier than expected?
- Did implementation uncover inaccuracies in the mode outputs? (e.g., "zero callers" claim disproved when wiring the code)
- Which mode's findings held up best under implementation? Feed this back into mode selection for future runs.
- Record any false-positive patterns (e.g., "don't recommend abstracting the core substrate") for the next analysis of this project.

Progress Artifact (Crash Recovery)

Write and update MODES_ANALYSIS_PROGRESS.md after each phase completes:

# Modes of Reasoning Analysis Progress

## Status: [Phase N: Description]
## Started: [timestamp]
## Project: [PROJECT]

## Phase 0: Context Pack
- [x] Project profiled
- [x] Context pack built

## Phase 1: Mode Selection
- [x] Axes identified: [list]
- [x] 10 modes selected: [list with codes]

## Phase 2: Spawn
- [x] Session: $PROJECT
- [x] Agents: N cc + M cod

## Phase 3: Dispatch
- [x] Pane 0: [mode] - dispatched
- [x] Pane 1: [mode] - dispatched
- ...

## Phase 4: Monitor
- [ ] Cron ID: $JOB_ID
- Pane 0: [mode] - [working/done/stuck]
- Pane 1: [mode] - [working/done/stuck]
- ...

## Phase 5: Collect
- [ ] All outputs collected
- [ ] Contributions scored

## Phase 6: Synthesize
- [ ] Report written

## Recovery Notes
[If resuming after interruption, state what phase to continue from]

Update this after EVERY phase transition. If context compacts or session interrupts, re-read this file to resume.

Circuit Breakers

Stop and reassess if ANY of these trigger:

Breaker	Threshold	Action
Total time	90 minutes (exhaustive) / 45 min (deep) / 15 min (quick)	Collect whatever exists, synthesize
Consecutive stuck agents	3+ agents show no progress for 9 minutes	Cancel cron, collect, synthesize with what exists
All agents rate-limited	0 agents producing output	Stop immediately, synthesize with what exists
Repeated identical errors	Same error in 3+ agents	Diagnose root cause before continuing
Context approaching limit	80% context consumed	Stop monitoring, proceed to synthesis

Lead Agent Operator Cards

These are YOUR cognitive moves during orchestration. Apply them deliberately.

⊘ Axis Scan

When: Phase 1, before mode selection Action: Identify which 2-3 taxonomy axes matter most. If all your modes sit on one side, you have an echo chamber. Failure mode: Treating all axes as equally important (they rarely are).

⊕ Cross-Pollinate

When: Phase 6, during synthesis Action: Take a finding from one mode and explicitly check whether any OTHER mode's framework would predict or contradict it. Failure mode: Only comparing modes that are already similar.

✂ Kill Thesis (Strengthened)

When: Phase 6, for each convergent finding Action: Actively try to find a reason the convergent finding is WRONG. For code-related findings, this means mandatory counter-search: if a finding claims "zero callers" or "dead code," search for type names, constructor calls, method names, re-exports, conditional compilation, and test usage — not just module imports. For recommendations, apply the Identity Check and Senior Engineer Gut Check. If you can't kill it, confidence is justified. Failure mode: Rubber-stamping convergence because multiple modes agree. Multiple modes can share the same methodology and therefore the same blind spot.

🏗 Identity Check

When: Phase 6, before any recommendation to "abstract X" or "decouple from X" Action: Ask: "Would removing X change what this project IS?" If yes, coupling to X is a feature, not a bug, and the recommendation should be filtered out. A Named Tmux Manager should not abstract away tmux. A Rust filesystem should not be ported to Go. Check the core substrate recorded in Phase 0. Failure mode: Treating all dependencies as incidental. The core substrate is not a dependency to abstract — it IS the product.

👷 Senior Engineer Gut Check

When: Phase 6, after synthesis draft, before finalizing Action: Role-play a senior engineer who built this system and uses it daily. For each finding, ask: "Would they agree with this finding and its severity?" For each recommendation, ask: "Would they accept this, or would they immediately say 'that's not how this works'?" The tmux-abstraction recommendation, the "scope is 5x too large" claim, and the "overengineered" verdict all failed this check in real deployments. Failure mode: Analyzing from the outside consultant perspective without internalizing the builder's perspective.

𝓛 Level Check

When: Phase 6, for divergent findings Action: Check if disagreeing modes are operating at different levels of abstraction. Often "disagreements" are about different things. Failure mode: Forcing a resolution when the modes are answering different questions.

⊞ Blind Spot Scan

When: Phase 6, after initial synthesis Action: For each of the 12 categories NOT represented in your 10 modes, ask: "What would a [Category] mode have found that we missed?" Failure mode: Only looking at what modes found, never at what was structurally invisible.

ΔE Evidence Delta

When: Phase 5, while scoring contributions Action: For each finding, assess: "If this evidence didn't exist, would the finding still hold?" Findings that survive evidence removal are structural insights. Failure mode: Weighting all evidence equally regardless of independence.

See LEAD_AGENT_PLAYBOOK.md for the full operator card set.

Anti-Patterns

Mode Echo Chamber

Problem: Selecting modes that all say the same thing (e.g., 5 formal modes). Fix: Enforce diversity across at least 5 of 12 categories AND at least 3 of 7 axes.

Shallow Mode Application

Problem: Agent describes the mode instead of applying it. Fix: The prompt must say "apply this framework" not "describe this framework." Send depth nudge.

Synthesis by Concatenation

Problem: The report just pastes outputs end-to-end without actual synthesis. Fix: The triangulation protocol forces real analytical work: kernel/supported/hypothesis/disputed.

Premature Consensus

Problem: Declaring agreement when modes just didn't look at the same things. Fix: Distinguish "agreement" (same claim, independent evidence) from "non-contradiction" (different topics).

Mode Cargo-Culting

Problem: Forcing a mode to produce findings when it genuinely has nothing relevant to say. Fix: It's fine for a mode to report "this lens has limited applicability here" with reasoning.

Frame Lock

Problem: The lead agent locks onto the first interesting finding and stops exploring. Fix: Apply ⊞ Blind Spot Scan operator. Force yourself to consider what's missing.

Values Laundering

Problem: Normative conclusions disguised as descriptive findings (the project "should" do X presented as "the project has a problem"). Fix: Check the descriptive/normative axis. Separate "what is true" from "what we value."

Narrative Closure

Problem: Building a satisfying story that ignores disconfirming evidence. Fix: For every convergent finding, apply ✂ Kill Thesis. If you can't kill it, it's real.

Methodological Convergence (False KERNEL)

Problem: Multiple modes agree on a finding, but all derive their evidence from the same methodology (e.g., grepping for imports, counting LOC). Agreement reflects shared methodology, not independent confirmation. In one deployment, 4 modes all grepped for use statements and agreed on "30-40K dead lines" — actual dead code was ~15-18K. Fix: For KERNEL findings, require at least 2 DISTINCT evidence methodologies. "4 modes all grepped and found zero callers" is one methodology used 4 times. Require at least one mode to verify through a different path (reading call sites, tracing imports, checking runtime behavior, running tests).

Core Substrate Blindness

Problem: Recommending to "abstract away" the project's core identity/substrate. A Named Tmux Manager shouldn't abstract tmux. A Rust filesystem shouldn't decouple from Rust. Fix: Apply 🏗 Identity Check from Phase 0. Before any decoupling recommendation, ask: "Would removing X change what this project IS?" If yes, filter the recommendation.

Restated Known Limitations

Problem: Modes "discover" things the developer already documented in README.md, AGENTS.md, or issue trackers. The report spends 30% of its length telling the developer what they already wrote. Fix: Extract known limitations in Phase 0. Tag findings that merely restate documented issues as "Confirmed Known Risk" rather than presenting them as discoveries. Weight novel insights higher in scoring.

Severity Theater

Problem: Rating findings as CRITICAL using a worst-case threat model that doesn't match reality. A localhost-only developer tool rated as having "CRITICAL" API security when the attacker would need shell access to exploit it. Fix: All severity ratings MUST reference the actual deployment context from Phase 0. Before rating above MEDIUM, state the deployment context and explain why the severity applies to THAT context.

Unactionable Counterfactuals

Problem: Analyzing irreversible decisions ("What if this had been written in Rust?") that produce zero actionable output for an 85K-line Go project. Fix: Constrain counterfactual mode to decisions that are still reversible at reasonable cost. Skip language choices, core substrate selection, and other decisions where the project cannot realistically change course.

Transient State Over-Interpretation

Problem: Treating a snapshot observation (e.g., "build is broken on main") as evidence of systemic quality erosion, confirmation bias, or testing failure. A broken build is likely one bad commit — a transient state, not a structural property. Fix: Before using a snapshot observation as evidence, ask: "Is this a transient state or a structural property?" Broken builds, temporary test failures, and in-progress refactors are transient. Code architecture, dependency patterns, and API design are structural. Don't cite transient states as evidence for structural claims.

Analysis Theater

Problem: Frameworks that add rigorous-looking structure (FMEA tables with calculated RPNs, probability matrices) but reach the same conclusions that anyone would after 5 minutes of thought. The framework adds ceremony without producing substantially different conclusions from common sense. Fix: For each structured analysis output, ask: "Would someone with 5 minutes and no framework reach a different conclusion?" If the answer is no, the framework added ceremony, not insight. Report the conclusion directly without the framework scaffolding. Reserve quantitative frameworks for cases where they actually change the rank ordering or surface non-obvious interactions.

Penalizing Honest Documentation

Problem: Modes flag transparent self-assessment in project documentation as a red flag. E.g., a FEATURE_PARITY.md that explicitly documents its interpretation rules gets flagged as "self-referential" or "deceptive" by multiple modes — when the developer put the caveat in the same file deliberately to be transparent. Fix: Cross-reference findings with project documentation. If the developer already disclosed the limitation you're flagging, that's transparency, not deception. Penalizing honesty creates perverse incentives. Classify these as "Confirmed Known Risk," not discoveries.

Meta-Infinite Regress

Problem: Spending more time reasoning about reasoning than actually reasoning. Fix: Circuit breakers. Meta-reasoning serves the analysis, not the reverse.

Ritual Calibration

Problem: Every finding rated 0.7 confidence (lazy default). Fix: Require calibration justification: "What would change this to 0.5? What would make it 0.9?"

Troubleshooting

Problem	Fix
Agent ignores its mode assignment	Resend with stronger framing; put mode name in ALL CAPS in the prompt
Agent doesn't write output file	Send explicit: "Write MODE_OUTPUT_X.md NOW with all sections"
Too few agents available	Reduce to available count; prioritize diverse mode selection across axes
Agents editing project code	Clarify: "This is ANALYSIS ONLY. Do not modify project files."
Report feels superficial	Read mode outputs more carefully; apply ⊕ Cross-Pollinate operator
All modes agree on everything	Either the project is simple or modes weren't diverse enough on axes
Agent produces generic analysis	Mode assignment wasn't specific enough; resend with failure modes and differentiator emphasized
Synthesis takes too long	Use the quick-synthesis variant: kernel + top disagreements + recommendations only
Context compaction mid-analysis	Read MODES_ANALYSIS_PROGRESS.md to resume from last completed phase
One mode dominates the report	Check contribution scores; weight by unique insights, not raw finding count

Reference Index

Topic	Reference
Full 80-mode taxonomy with descriptions, failure modes, and selection heuristics	MODE_TAXONOMY.md
The 7 taxonomy axes with per-axis mode mapping and selection guidance	TAXONOMY_AXES.md
Proven mode combinations, stacks for problem types, hybrid reasoning patterns	MODE_COMPOSITION.md
Master prompt template, per-mode customizations, nudge prompts, thinking directives	PROMPTS.md
Triangulation protocol, conflict resolution, contribution scoring, provenance tracking	SYNTHESIS_METHODOLOGY.md
Full report template with all 18 sections and synthesis guidelines	REPORT_TEMPLATE.md
Lead agent operator cards, cognitive moves, decision framework for orchestration	LEAD_AGENT_PLAYBOOK.md
Swarm configuration, monitoring protocol, quality assessment, non-software projects	OPERATIONS.md
Using modes for radical innovation, creative ideation, and conceptual breakthroughs	CREATIVE_APPLICATIONS.md

NTM Installation

If NTM is not available on the current machine, install it before proceeding:

curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/ntm/main/install.sh?$(date +%s)" | bash -s -- --easy-mode

NTM is required for this skill. Do not attempt to run it without NTM.

Related Skills

ntm for NTM command reference
vibing-with-ntm for swarm orchestration patterns
code-review-gemini-swarm-with-ntm for similar swarm-based review
operationalizing-expertise for distilling findings into reusable artifacts
multi-model-triangulation for cross-validating with different AI models
codebase-audit for domain-parameterized auditing
alien-artifact-coding for formal guarantees and mathematical optimization

name	modes-of-reasoning-project-analysis
description	Multi-perspective project analysis via NTM reasoning-mode agent swarm. Use when "modes of reasoning", "multi-perspective analysis", or "epistemological review" of any project.

Modes of Reasoning Project Analysis

Core idea: Spawn 10 agents (default), each assigned a distinct reasoning mode from the 80-mode taxonomy, to analyze a project from radically different analytical lenses. Synthesize their outputs into a single comprehensive report with provenance-tracked findings, triangulated consensus, and scored contributions.

Scope: Not just software bugs. This finds methodological flaws, logical fallacies, misconceptions, reasoning errors, architectural blind spots, missing perspectives, and also generates new ideas, extensions, and radical innovations.

You are the lead agent. You select modes, orchestrate the swarm, monitor progress, apply early-stopping heuristics, collect outputs, triangulate consensus, resolve disagreements with explicit conflict-resolution strategies, score contributions, and compile the final report. The swarm agents do the deep analysis; you do the meta-reasoning and synthesis.

Philosophical foundation: There is no single correct way to reason. "Mode of reasoning" refers to at least four overlapping concepts: inference patterns (deduction vs abduction), uncertainty representations (probability vs fuzzy membership), problem-solving methods (planning, optimization), and domain styles (scientific, legal, ethical). Real-world reasoning is inherently hybrid. The power of this skill comes from making the hybrid explicit and orchestrating diversity with discipline.

The Seven Taxonomy Axes

Axis	Pole 1	Pole 2	Selection Impact
Ampliative vs Non-ampliative	Conclusions go beyond premises (induction, abduction)	Conclusions contained in premises (deduction)	Discovery needs ampliative; assurance needs non-ampliative
Monotonic vs Non-monotonic	Adding info never retracts conclusions	Adding info can retract conclusions	Most real reasoning is non-monotonic; pure math is monotonic
Uncertainty vs Vagueness	Uncertainty about crisp facts (probability)	Vagueness in predicates themselves (fuzzy)	Don't treat "tall" as a probability problem
Descriptive vs Normative	What is (facts, causes)	What ought (values, duties)	Decisions fail when values hide in "facts"
Belief vs Action	What to believe/accept	What to do/choose	Separating belief from decision improves accountability
Single-agent vs Multi-agent	Uncertainty as noise	Others respond strategically	Security, markets, governance need multi-agent modes
Truth vs Adoption	Accuracy/validity	Audience/coordination	Org failures are often rhetorical, not logical

Rule: Your 10 modes MUST span at least 3 axes. If all modes sit on the same side of every axis, you have an echo chamber. See TAXONOMY_AXES.md for full axis analysis.

Arguments

Parse from invocation text. Defaults:

Argument	Default	Description
`--agents=N`	10	Number of agents (max 10, each gets one mode)
`--cc=N`	5	Claude Code agents
`--cod=N`	5	Codex agents
`--gmi=N`	0	Gemini agents (optional, for 3-model triangulation)
`--project=PATH`	cwd	Target project to analyze
`--focus=TOPIC`	(none)	Optional focus area to bias mode selection
`--depth=LEVEL`	deep	Analysis depth: quick (15min), deep (45min), exhaustive (90min+)
`--creative`	false	Bias mode selection toward innovation/ideation
`--adversarial`	false	Bias mode selection toward attack/critique
`--concise`	false	Compact 7-section report (skip appendices, scoreboard, mode perf notes)
`--output=PATH`	`MODES_OF_REASONING_REPORT_AND_ANALYSIS_OF_PROJECT.md`	Report filename

Phase 0: Context Pack Generation

Before selecting modes, build a rich understanding of the project. This prevents thin-context failures where agents analyze superficially because they lack project understanding.

Project Profiling

# 1. Read ALL project docs
cat README.md AGENTS.md CLAUDE.md CONTRIBUTING.md 2>/dev/null
# 2. Language and framework detection
find . -type f \( -name '*.go' -o -name '*.py' -o -name '*.ts' -o -name '*.rs' -o -name '*.java' \) | head -80
# 3. Structure analysis
find . -type d -not -path './.git/*' -not -path './node_modules/*' | head -40
# 4. Recent trajectory
git log --oneline -30
git diff --stat HEAD~10..HEAD 2>/dev/null
# 5. Open issues / beads if available
br ready --json 2>/dev/null || true
# 6. Test posture
find . -name '*_test.*' -o -name '*.test.*' -o -name 'test_*' | wc -l

Thin Context Detection

If ANY of these are true, ask the user for more context before proceeding:

No README or it's < 100 words
No clear entry point identified
Project purpose is ambiguous after reading docs
You cannot identify the primary programming language
The focus area is specified but you don't understand what it means in this project

Deployment Context (MANDATORY)

Before any analysis, capture the project's real-world deployment context. All severity ratings in the report MUST reference this context, not a hypothetical worst case.

Record:

Who runs this? Solo developer, small team, large org?
Where does it run? Localhost-only, internal network, public-facing internet?
Who are the users? Just the developer, a small team, thousands of end-users, zero users (pre-release)?
What is the threat model? What attackers are realistic? What's the actual attack surface?
Development stage: Early prototype, active development, production, maintenance mode?

Why this matters: A finding that is CRITICAL for a public-facing web service may be LOW for a localhost developer tool. A recommendation that makes sense for a 10-person team is absurd for a solo developer in early development. Three post-mortems showed severity inflation when deployment context was ignored — e.g., rating a localhost API as "CRITICAL" risk when the attacker would need shell access to exploit it (at which point they don't need the API).

Known Limitations Pre-Filter

Before dispatch, extract all documented limitations, caveats, and acknowledged risks from:

README.md, AGENTS.md, CLAUDE.md
Issue trackers, CHANGELOG, TODO files
Inline comments flagging known issues

Project Identity Check

Identify the project's core substrate — the dependency or concept that defines what the project IS, not just what it uses:

Identity test: "Would removing X change what this project IS?" If yes, coupling to X is a feature, not a bug.

Examples:

A "Named Tmux Manager" — tmux is the identity, not an incidental dependency
A "Rust filesystem" — Rust is the identity
A "React dashboard" — React is a tool choice, not the identity (usually)

Project Values and Rules

Read the project's stated values and rules (from AGENTS.md, CLAUDE.md, README). Record them. Examples:

"No tech debt, do it right" → modes penalizing complexity should have their severity discounted
"No premature abstraction" → filter recommendations that add abstraction layers
"Ship fast, iterate" → weight practical modes higher than formal ones

These values inform mode selection (Phase 1) and filter recommendations (Phase 6).

Context Pack (record for later)

Write a mental note (or scratch file) capturing:

Project brief: name, purpose, languages, frameworks, LOC estimate
Architecture: key components, data flow, integration points
Recent trajectory: what has been changing, what direction
Known concerns: from docs, issues, or user's focus area
Stakeholders: who uses this, who maintains it, who depends on it
Success criteria: what "good" looks like for this project
Deployment context: from above (who, where, users, threat model, stage)
Core substrate: from identity check
Project values: from rules extraction
Known limitations: list of owner-acknowledged issues

This context pack will be embedded in every agent's prompt.

Phase 1: Mode Selection

This is the critical step. You must select 10 modes that maximize analytical coverage for THIS specific project. Do not pick randomly.

The Selection Algorithm

Identify the 2-3 most important taxonomy axes for this project (see table above)
Read the full taxonomy in MODE_TAXONOMY.md
Check mode composition stacks in MODE_COMPOSITION.md for proven combinations
Select modes ensuring:
- At least 5 of 12 categories (A-L) represented
- At least 3 of 7 taxonomy axes spanned
- At least 2 modes that directly oppose each other (e.g., worst-case vs option-generation)
- At least 1 meta-reasoning mode (L category) for calibration
- If --creative: at least 3 from B (Ampliative) category
- If --adversarial: at least 3 from F+H (Causal + Strategic) categories
Check against project values -- read the project's stated rules from Phase 0. If the project says "no premature abstraction," bias away from modes that tend to recommend decoupling (B9 Simplicity). If the project says "do it right, no tech debt," don't over-weight modes that penalize complexity.
Record your rationale -- which axes drove the selection, which modes were considered but excluded

Selection Criteria

Project type -- a web app needs different modes than a compiler or a research paper
Known risks -- if the project has security concerns, include adversarial modes
Focus area -- if --focus is specified, bias selection toward relevant categories
Diversity -- spread across at least 5 of the 12 categories (A-L) to avoid blind spots
Complementarity -- pair modes that challenge each other (e.g., deductive + inductive, worst-case + option-generation)
Axis coverage -- ensure both poles of the most important axes are represented
Model diversity -- assign modes to different agent types (Claude/Codex/Gemini) to exploit model-level analytical differences

Recommended Starting Distributions

Software project (general):

Mode	Code	Why	Axis Coverage
Systems-Thinking	F7	See the whole architecture	Descriptive
Root-Cause	F5	Find underlying problems, not symptoms	Descriptive, causal
Deductive	A1	Verify logical consistency of design	Non-ampliative, monotonic
Inductive	B1	Pattern-match from observed code to general flaws	Ampliative, non-monotonic
Adversarial-Review	H2	Stress-test assumptions	Multi-agent, action
Failure-Mode	F4	What can go wrong, will go wrong	Action, uncertainty
Edge-Case	A8	Boundary conditions and corner cases	Non-ampliative
Perspective-Taking	I4	See it from users, maintainers, attackers	Multi-agent, adoption
Counterfactual	F3	What if key decisions were made differently?	Ampliative, belief
Debiasing	L2	Catch cognitive biases in the other 9 modes	Meta-reasoning

Research/methodology project: Replace Edge-Case and Failure-Mode with Bayesian (B3) and Ambiguity-Detection (D2).

Creative/product project: Replace Root-Cause and Adversarial with Conceptual-Blending (B8) and Analogical (B6).

See MODE_COMPOSITION.md for 12+ proven mode stacks.

After selecting, record your choices and rationale. You will need this for the report.

Phase 2: Spawn the Swarm

ntm spawn $PROJECT \
  --cc=$NUM_CC --cod=$NUM_COD \
  --no-user \
  --stagger-mode=smart

Wait for agents to be ready:

ntm --robot-wait=$PROJECT --condition=idle --timeout=120

Model-to-Mode Affinity

Not all modes work equally well on all models. Suggested assignments:

Best on Claude (nuanced analysis)	Best on Codex (thorough enumeration)	Either
Systems-Thinking (F7)	Edge-Case (A8)	Root-Cause (F5)
Counterfactual (F3)	Inductive (B1)	Deductive (A1)
Perspective-Taking (I4)	Failure-Mode (F4)	Adversarial (H2)
Debiasing (L2)	Dependency-Mapping (F2)	Option-Generation (B5)
Conceptual-Blending (B8)	Type-Theoretic (A7)	Bayesian (B3)

Phase 3: Dispatch Mode-Specific Prompts

Send each agent a unique, mode-specific prompt. Target agents individually by pane index. Stagger dispatch by 15-20 seconds to prevent thundering-herd effects:

for pane in 0 1 2 3 4 5 6 7 8 9; do
  ntm send $PROJECT --pane=$pane "$(cat <<'PROMPT'
  <INSERT MODE-SPECIFIC PROMPT FROM PROMPTS.md>
  PROMPT
  )"
  sleep 18
done

The prompt for each agent MUST include:

Project context pack -- the brief from Phase 0
Project study directive -- read README, AGENTS.md, understand architecture
Mode assignment -- the specific reasoning mode with full description, failure modes, and differentiator
Taxonomy axis awareness -- which axes this mode operates on
Output contract -- mandatory structured output format with provenance
Scope -- analyze the ENTIRE project, not just code
Thinking directive -- "Think hard about this. Apply your framework rigorously."

See PROMPTS.md for the full prompt template and per-mode customization.

The Mode Assignment Block (include in every prompt)

YOUR REASONING MODE: [Mode Name] ([Code])
Category: [Category]
Taxonomy Axes: [Which axes this mode operates on]

[Full mode description from taxonomy]

WHAT YOU PRODUCE:
[Mode-specific outputs]

BEST APPLIED TO:
[List of best-for scenarios]

WATCH OUT FOR (your failure modes):
[List of failure modes for this reasoning approach]

WHAT MAKES THIS MODE UNIQUE:
[Differentiator]

COMPLEMENTARY MODES IN THIS ENSEMBLE:
[List of other modes in this run and how they relate to yours]

The Output Contract (include in every prompt)

Every agent MUST produce a file named MODE_OUTPUT_[MODE_ID].md containing:

# [Mode Name] ([Code]) Analysis

## Thesis
One-paragraph summary of this mode's core finding about the project.

## Top Findings
1. [Finding with evidence and reasoning]
2. ...
(5-8 findings. Every one must cite specific evidence.)

Each finding MUST include:
- §F[N]: Finding ID for cross-referencing (e.g., §F1, §F2)
- Evidence: specific file, function, line, or document reference
- Reasoning chain: how this mode specifically reveals this
- Severity: critical / high / medium / low (calibrated against deployment context, not theoretical worst case)
- Confidence: 0.0-1.0 for this specific finding
- So What?: "If the project owner reads this, what specifically would they do differently tomorrow?" (findings without a concrete next-day action are demoted to "observations")

5-8 findings. Quality over quantity. Every finding must cite specific evidence.

## Risks Identified
- [Risk with severity and likelihood assessment]

## Recommendations
- [Actionable recommendation with justification]
- Priority: P0-P4 (same scale as beads)
- Effort: low / medium / high
- Expected benefit description

## New Ideas and Extensions
- [Proposed improvement or extension with rationale]
- Innovation score: incremental / significant / radical

## Assumptions Ledger
- [Unstated assumption this analysis depends on]
- [Assumption the PROJECT makes that this mode questions]

## Questions for Project Owner
- [Question that this analytical lens raises]

## Points of Uncertainty
- [Where this mode's analysis is uncertain and why]

## Agreements and Tensions with Other Perspectives
- [What you expect other modes might agree/disagree with]

## Confidence: [0.0-1.0]
Calibration note: [brief justification -- what would change your confidence?]

Phase 4: Monitor the Swarm

Monitoring Cron (fires every 3 minutes)

Set up immediately after dispatch:

CronCreate(
  cron: "*/3 * * * *",
  recurring: true,
  prompt: "Check the modes-of-reasoning swarm for $PROJECT. Run these commands:
    1. ntm --robot-is-working=$PROJECT
    2. ntm --robot-tail=$PROJECT --lines=80
    3. ls -la MODE_OUTPUT_*.md 2>/dev/null

  For each agent, determine:
  (a) Working / idle / stuck / rate-limited?
  (b) Has it produced its MODE_OUTPUT_*.md file?
  (c) If output exists, is it substantive or superficial?

  ACTIONS:
  - If agent idle + no output: send nudge with mode-specific reminder
  - If agent idle + output exists: agent is done, note completion
  - If agent stuck (same output for 2+ checks): send depth nudge
  - If output superficial (< 3 findings, no evidence): send depth nudge
  - If all agents done or hard timeout reached: cancel this cron and report

  QUALITY ASSESSMENT on each check:
  - Read what agents are actually producing, not just status
  - Are findings citing specific evidence (files, lines, docs)?
  - Is the mode's unique lens visible in the analysis?
  - Flag any agent that is describing the mode instead of applying it

  EARLY STOPPING HEURISTIC:
  - If 8+ agents have produced substantive output and remaining agents show no progress for 2 consecutive checks, proceed to collection.

  Report concisely: N done, N working, N stuck, quality observations."
)

Save the cron job ID for later cancellation.

Nudge Prompts for Stuck Agents

Generic nudge (idle, no output):

ntm send $PROJECT --pane=$N "You are analyzing this project through the lens of [MODE_NAME]. Go deeper. Don't just describe what you see -- apply your specific analytical framework to find things other perspectives would miss. Write your findings to MODE_OUTPUT_[MODE_ID].md with all required sections."

Depth nudge (superficial output):

ntm send $PROJECT --pane=$N "Your MODE_OUTPUT_[MODE_ID].md needs more depth. Each finding must cite specific evidence (file paths, function names, line numbers, document sections). Your analysis should reveal insights that ONLY the [MODE_NAME] lens can see. Rewrite with substantially more analytical depth and at least 8 findings."

Completion nudge (nearly done):

ntm send $PROJECT --pane=$N "Finalize MODE_OUTPUT_[MODE_ID].md. Ensure: (1) every finding has a §F[N] ID and evidence, (2) the Assumptions Ledger is filled in, (3) your confidence is calibrated honestly, (4) the 'Agreements and Tensions' section predicts where other modes will agree/disagree with you."

Early Stopping Heuristics

Borrowed from NTM's ensemble pipeline:

Velocity tracking: If an agent's findings-per-minute drops below 0.5 for 6+ minutes, it's hitting diminishing returns
Similarity detection: If a new finding duplicates an existing one (same file, same issue), the mode is saturated
Output completeness: If all mandatory sections are filled and confidence is calibrated, the agent is done
Don't wait for perfection -- 8/10 substantive outputs is enough to proceed

Phase 5: Collect and Score Outputs

Once agents have produced output (or after timeout):

Cancel the monitoring cron: CronDelete(id: $JOB_ID)
Capture final pane state:
```
ntm --robot-tail=$PROJECT --lines=200
```
Collect all output files:
```
ls -la MODE_OUTPUT_*.md
```
Read every output file completely. Do not skim. You need deep understanding for synthesis.
Score each mode's contribution using these metrics:
- Findings count: total findings produced
- Unique findings: findings no other mode produced
- Evidence quality: proportion of findings with specific file/line references
- Calibration quality: does the confidence score match the evidence quality?
- Mode fidelity: is the mode's unique analytical lens visible in the output?

Compute contribution scores (from NTM's ensemble scoring):

score = 0.40 * (findings_produced / total_findings)
      + 0.30 * (unique_insights / total_unique)
      + 0.20 * (evidence_quality_rating)
      + 0.10 * (calibration_quality_rating)

Build a provenance chain for each finding:

§F[N] → discovered by [Mode] → deduplicated with [§F[M]] → cited in [report section]

Phase 5.5: Ground Truth Verification

For each candidate high-impact finding:

Identify the specific evidence claim (e.g., "zero callers for SpectralHealthMonitor")
Run an independent verification using a different methodology than the agent used:
- If the agent grepped for module imports → search for type names, constructor calls (::new), method calls
- If the agent did static analysis → run tests (cargo test, npm test, etc.) to verify dynamic behavior
- If the agent checked file sizes → read the actual code to understand what it does
- If the agent claimed "unused" → check for conditional compilation, re-exports, trait-mediated indirection, test-only usage
If verification fails, downgrade to HYPOTHESIS regardless of how many modes agreed on it

Why this phase exists: In one deployment, 4 modes agreed code was "dead" (30-40K lines). Actual dead code was ~15-18K. The agents all used the same methodology (grep for use statements) and missed integration through indirect paths. The "convergence" was methodological monoculture, not independent confirmation. In another deployment, modules marked "zero callers" were actively wired into the scheduler governor. This phase would have caught both failures.

Dynamic Analysis Requirement

Phase 6: Synthesize the Report

This is the most important phase. You are now the synthesis agent. Read REPORT_TEMPLATE.md for the full template and SYNTHESIS_METHODOLOGY.md for the triangulation process.

The Triangulation Protocol

Adapted from operationalizing-expertise methodology:

3+ modes agree on a finding via DIFFERENT evidence → KERNEL (high confidence). Independent discovery via different analytical frameworks AND different evidence pathways is the strongest signal. If all modes cited the same file sizes or import counts, that is ONE observation seen by many — not independent triangulation. KERNEL requires at least 2 distinct evidence methodologies.
2 modes agree → SUPPORTED. Include with note on limited cross-validation.
1 mode only → HYPOTHESIS. Present as unique insight. Valuable if evidence is strong.
Modes actively disagree → DISPUTED. Document both positions with full reasoning chains. Apply conflict resolution (see below).

Conflict Resolution Strategies

When modes disagree, diagnose WHY before resolving:

Diagnosis	Example	Resolution Strategy
Different evidence examined	One read code, another read docs	Both may be right in their domain -- combine
Different values prioritized	Security vs usability tradeoff	Present as genuine tradeoff, do not pick winner
Different assumptions	Current scale vs 10x growth	Present conditionally: "If X, then A; if Y, then B"
Analytical error	Mode misread the code	Identify error, discount that finding, note in Mode Performance
Genuine contradiction	"This is secure" vs "This is exploitable"	Investigate deeper yourself; one is wrong
Different axes	Descriptive vs normative	Both are correct; they're answering different questions

Explanation Layer

For every conclusion in the report, record:

Source modes: which modes contributed
Source findings: specific §F[N] references
Confidence basis: why this confidence level
Supporting evidence: what backs it up
Counter-evidence: what pushes against it
Conflict resolution method: if modes disagreed, how was it resolved (consensus / majority / weighted / adversarial / deferred)

The Final Report

Write MODES_OF_REASONING_REPORT_AND_ANALYSIS_OF_PROJECT.md (or custom --output path).

The report MUST contain these sections (see REPORT_TEMPLATE.md):

Executive Summary -- 1 page max, 3-5 key takeaways
Methodology -- which 10 modes were selected, which axes drove selection, rationale
Taxonomy Axis Analysis -- which axes matter most for this project and what the swarm revealed about each
Convergent Findings (Kernel) -- 3+ mode agreement, highest confidence, with provenance
Supported Findings -- 2-mode agreement
Divergent Findings -- points of disagreement with full reasoning chains for each position
Unique Insights by Mode -- per-mode findings no other mode caught (the value of diversity)
Risk Assessment -- aggregated risks with severity, likelihood, and agreement level
Recommendations -- prioritized with supporting mode count and effort estimates
New Ideas and Extensions -- innovations scored as incremental/significant/radical
Assumptions Ledger -- project assumptions surfaced across all modes
Open Questions -- unresolved questions for the project owner
Confidence Matrix -- per-finding confidence with supporting/dissenting modes
Contribution Scoreboard -- per-mode scores, diversity metric, coverage analysis
Mode Performance Notes -- which modes were most/least productive and why
Mode Selection Retrospective -- would you choose different modes with hindsight?
Appendix: Individual Mode Outputs -- full text or summary of each mode's analysis
Appendix: Provenance Index -- finding ID → source mode → report location mapping

Phase 7: Operationalize (Optional)

If the user wants to ACT on findings rather than just read them:

Create beads for top recommendations:

br create --title="[Recommendation title]" --type=task --priority=[P0-P4]

Offer follow-up swarm runs:
- "Would you like a deeper dive on any specific finding?"
- "Should I run a focused swarm on the disputed findings?"
- "Want me to create an implementation plan for the top 3 recommendations?"
Operationalize recurring insights using the operationalizing-expertise skill:
- Extract operator cards from findings that represent reusable analytical moves
- Build a project-specific kernel of confirmed principles
- Create validation gates based on the risks identified
Post-implementation feedback loop (after the top 3-5 recommendations are implemented):
- Assess: "Did implementing this reveal any issues with the original analysis?"
- Were any findings harder/easier than expected?
- Did implementation uncover inaccuracies in the mode outputs? (e.g., "zero callers" claim disproved when wiring the code)
- Which mode's findings held up best under implementation? Feed this back into mode selection for future runs.
- Record any false-positive patterns (e.g., "don't recommend abstracting the core substrate") for the next analysis of this project.

Progress Artifact (Crash Recovery)

Write and update MODES_ANALYSIS_PROGRESS.md after each phase completes:

# Modes of Reasoning Analysis Progress

## Status: [Phase N: Description]
## Started: [timestamp]
## Project: [PROJECT]

## Phase 0: Context Pack
- [x] Project profiled
- [x] Context pack built

## Phase 1: Mode Selection
- [x] Axes identified: [list]
- [x] 10 modes selected: [list with codes]

## Phase 2: Spawn
- [x] Session: $PROJECT
- [x] Agents: N cc + M cod

## Phase 3: Dispatch
- [x] Pane 0: [mode] - dispatched
- [x] Pane 1: [mode] - dispatched
- ...

## Phase 4: Monitor
- [ ] Cron ID: $JOB_ID
- Pane 0: [mode] - [working/done/stuck]
- Pane 1: [mode] - [working/done/stuck]
- ...

## Phase 5: Collect
- [ ] All outputs collected
- [ ] Contributions scored

## Phase 6: Synthesize
- [ ] Report written

## Recovery Notes
[If resuming after interruption, state what phase to continue from]

Update this after EVERY phase transition. If context compacts or session interrupts, re-read this file to resume.

Circuit Breakers

Stop and reassess if ANY of these trigger:

Breaker	Threshold	Action
Total time	90 minutes (exhaustive) / 45 min (deep) / 15 min (quick)	Collect whatever exists, synthesize
Consecutive stuck agents	3+ agents show no progress for 9 minutes	Cancel cron, collect, synthesize with what exists
All agents rate-limited	0 agents producing output	Stop immediately, synthesize with what exists
Repeated identical errors	Same error in 3+ agents	Diagnose root cause before continuing
Context approaching limit	80% context consumed	Stop monitoring, proceed to synthesis

Lead Agent Operator Cards

These are YOUR cognitive moves during orchestration. Apply them deliberately.

⊘ Axis Scan

⊕ Cross-Pollinate

✂ Kill Thesis (Strengthened)

🏗 Identity Check

👷 Senior Engineer Gut Check

𝓛 Level Check

⊞ Blind Spot Scan

ΔE Evidence Delta

See LEAD_AGENT_PLAYBOOK.md for the full operator card set.

Anti-Patterns

Mode Echo Chamber

Problem: Selecting modes that all say the same thing (e.g., 5 formal modes). Fix: Enforce diversity across at least 5 of 12 categories AND at least 3 of 7 axes.

Shallow Mode Application

Problem: Agent describes the mode instead of applying it. Fix: The prompt must say "apply this framework" not "describe this framework." Send depth nudge.

Synthesis by Concatenation

Problem: The report just pastes outputs end-to-end without actual synthesis. Fix: The triangulation protocol forces real analytical work: kernel/supported/hypothesis/disputed.

Premature Consensus

Problem: Declaring agreement when modes just didn't look at the same things. Fix: Distinguish "agreement" (same claim, independent evidence) from "non-contradiction" (different topics).

Mode Cargo-Culting

Problem: Forcing a mode to produce findings when it genuinely has nothing relevant to say. Fix: It's fine for a mode to report "this lens has limited applicability here" with reasoning.

Frame Lock

Problem: The lead agent locks onto the first interesting finding and stops exploring. Fix: Apply ⊞ Blind Spot Scan operator. Force yourself to consider what's missing.

Values Laundering

Narrative Closure

Problem: Building a satisfying story that ignores disconfirming evidence. Fix: For every convergent finding, apply ✂ Kill Thesis. If you can't kill it, it's real.

Methodological Convergence (False KERNEL)

Core Substrate Blindness

Restated Known Limitations

Severity Theater

Unactionable Counterfactuals

Transient State Over-Interpretation

Analysis Theater

Penalizing Honest Documentation

Meta-Infinite Regress

Problem: Spending more time reasoning about reasoning than actually reasoning. Fix: Circuit breakers. Meta-reasoning serves the analysis, not the reverse.

Ritual Calibration

Problem: Every finding rated 0.7 confidence (lazy default). Fix: Require calibration justification: "What would change this to 0.5? What would make it 0.9?"

Troubleshooting

Problem	Fix
Agent ignores its mode assignment	Resend with stronger framing; put mode name in ALL CAPS in the prompt
Agent doesn't write output file	Send explicit: "Write MODE_OUTPUT_X.md NOW with all sections"
Too few agents available	Reduce to available count; prioritize diverse mode selection across axes
Agents editing project code	Clarify: "This is ANALYSIS ONLY. Do not modify project files."
Report feels superficial	Read mode outputs more carefully; apply ⊕ Cross-Pollinate operator
All modes agree on everything	Either the project is simple or modes weren't diverse enough on axes
Agent produces generic analysis	Mode assignment wasn't specific enough; resend with failure modes and differentiator emphasized
Synthesis takes too long	Use the quick-synthesis variant: kernel + top disagreements + recommendations only
Context compaction mid-analysis	Read MODES_ANALYSIS_PROGRESS.md to resume from last completed phase
One mode dominates the report	Check contribution scores; weight by unique insights, not raw finding count

Reference Index

Topic	Reference
Full 80-mode taxonomy with descriptions, failure modes, and selection heuristics	MODE_TAXONOMY.md
The 7 taxonomy axes with per-axis mode mapping and selection guidance	TAXONOMY_AXES.md
Proven mode combinations, stacks for problem types, hybrid reasoning patterns	MODE_COMPOSITION.md
Master prompt template, per-mode customizations, nudge prompts, thinking directives	PROMPTS.md
Triangulation protocol, conflict resolution, contribution scoring, provenance tracking	SYNTHESIS_METHODOLOGY.md
Full report template with all 18 sections and synthesis guidelines	REPORT_TEMPLATE.md
Lead agent operator cards, cognitive moves, decision framework for orchestration	LEAD_AGENT_PLAYBOOK.md
Swarm configuration, monitoring protocol, quality assessment, non-software projects	OPERATIONS.md
Using modes for radical innovation, creative ideation, and conceptual breakthroughs	CREATIVE_APPLICATIONS.md

NTM Installation

If NTM is not available on the current machine, install it before proceeding:

curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/ntm/main/install.sh?$(date +%s)" | bash -s -- --easy-mode

NTM is required for this skill. Do not attempt to run it without NTM.

Related Skills

ntm for NTM command reference
vibing-with-ntm for swarm orchestration patterns
code-review-gemini-swarm-with-ntm for similar swarm-based review
operationalizing-expertise for distilling findings into reusable artifacts
multi-model-triangulation for cross-validating with different AI models
codebase-audit for domain-parameterized auditing
alien-artifact-coding for formal guarantees and mathematical optimization