Run any Skill in Manus with one click

paw-sot

Stars38

Forks6

UpdatedMarch 15, 2026 at 02:21

Society of Thought (SoT) — multi-perspective deliberative review engine. Orchestrates specialist personas with parallel or debate interaction, confidence-weighted synthesis, and interactive moderator mode. Use for code review, plan review, design analysis, or any scenario requiring diverse expert perspectives.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

lossyrob

lossyrob/phased-agent-workflow

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Computer Systems AnalystsComputer and Mathematical Occupations·SOC 15-1211

File Explorer

15 files

SKILL.md

readonly

Society of Thought Engine

Execution Context: This skill is loaded into the calling agent's session (not spawned as a subagent), preserving user interactivity for trade-off decisions and moderator mode.

Scenario-agnostic orchestration engine for multi-perspective deliberative review. Discovers specialist personas, spawns them as subagents against review targets, orchestrates parallel or debate interaction, and produces confidence-weighted synthesis.

Capabilities

Specialist discovery with 4-level precedence (workflow → project → user → built-in)
Adaptive specialist selection based on content analysis
Parallel and debate interaction modes
Context-adaptive framing for code, artifact, or freeform review targets
Confidence-weighted synthesis with grounding validation
Post-synthesis moderator mode for deeper specialist engagement

Review Context Input Contract

The calling skill provides a review context describing what to review and how to orchestrate:

Field	Required	Values	Description
`type`	yes	`diff` \| `artifacts` \| `freeform`	What kind of content is being reviewed
`coordinates`	yes	diff range, artifact paths, or content description	What to point specialists at
`output_dir`	yes	directory path	Where to write REVIEW-*.md files
`specialists`	no	`all` (default) \| comma-separated names \| `adaptive:<N>`	Which specialists to invoke
`context`	no	string (type-dependent default — see Context Filtering)	Domain filter for specialists (case-insensitive match)
`interaction_mode`	yes	`parallel` \| `debate`	How specialists interact
`interactive`	yes	`true` \| `false` \| `smart`	Whether to pause for user decisions on trade-offs
`specialist_models`	no	`none` \| model pool \| pinned pairs \| mixed	Model assignment for specialists
`framing`	no	free text	Caller-provided preamble for `freeform` type
`perspectives`	no	`none` \| `auto` (default) \| comma-separated names	Which perspective overlays to apply
`perspective_cap`	no	positive integer (default `2`)	Max perspective overlays per specialist

Context-Adaptive Preambles

Before composing specialist prompts, inject a type-dependent preamble that frames how the specialist should interpret its cognitive strategy and domain for the review target.

Type diff (code/implementation review):

You are reviewing implementation changes (code diff). Apply your cognitive strategy and domain expertise to the actual code changes. Look for bugs, security issues, performance problems, pattern violations, and correctness gaps in the diff. Cite specific file paths and line numbers.

Type artifacts (design/planning review):

You are reviewing design and planning documents, not code. Apply your cognitive strategy to design decisions, architectural choices, and unstated assumptions. Your domain expertise should evaluate whether the planned approach is sound — look for gaps in reasoning, missing edge cases, feasibility risks, and cross-concern conflicts. Reference specific sections and claims in the documents rather than code lines.

Type freeform:

Use the caller-provided framing field content as the preamble. If no framing provided, use a neutral framing: "You are reviewing the provided content. Apply your cognitive strategy and domain expertise to identify issues, risks, and opportunities for improvement."

Specialist Discovery

Discover specialist personas at 4 precedence levels (most-specific-wins for name conflicts):

Workflow: Parse specialists from the review context — if an explicit comma-separated list, resolve only those names
Project: Scan .paw/personas/<name>.md files in the repository
User: Scan ~/.paw/personas/<name>.md files
Built-in: Scan references/specialists/<name>.md files (excluding _shared-rules.md)

Resolution rules:

If specialists is all (default): include all discovered specialists from all levels
If a fixed list (e.g., security, performance, testing): resolve each name against discovered specialists, most-specific-wins
If adaptive:<N>: discover all, then select N most relevant (see Adaptive Selection below)
Same filename at project level overrides user level overrides built-in
Skip malformed or empty specialist files with a warning; continue with remaining roster
If zero specialists found after discovery, fall back to built-in defaults with a warning

Context Filtering

The context field (domain filter) is distinct from the review context input structure.

When the review context includes a context field, filter discovered specialists to only those matching the specified domain. This enables domain-specific reviews where only relevant specialists participate.

Specialist context declaration: Specialists declare their domain via a context field in YAML frontmatter:

---
context: implementation
---

Matching behavior:

Each specialist declares a single context value (comma-separated or array values are not supported)
Case-insensitive string comparison (e.g., Business matches business)

Default behaviors:

Specialists without a context field, or with an empty/whitespace-only context value, default to implementation
Callers without context field:
- diff and artifacts types → default to implementation
- freeform type → no filtering (all specialists participate)

Zero-match handling: If context filtering results in zero matching specialists:

Interactive mode (interactive: true): Warn and prompt user — options: proceed with all specialists, specify different context, or abort
Non-interactive mode (interactive: false): Warn and proceed with all specialists as fallback
Smart mode (interactive: smart): Warn and prompt user (escalate on zero-match — context misconfiguration is more likely a setup error than a soft signal, warranting user confirmation)

Execution order: Context filtering occurs after specialist discovery and before adaptive selection. When both context filtering and adaptive selection are active, the adaptive algorithm selects from the already-filtered pool. When a fixed specialist list is provided, context filtering is skipped — explicit naming overrides domain filtering.

Adaptive Selection

When specialists is adaptive:<N>, select the N most relevant specialists from the discovered roster (after any context filtering) based on content analysis.

Selection process:

Analyze the review target to identify dominant change categories — file types, affected subsystems, nature of changes (new logic, refactoring, config, API surface, data handling, test coverage)
For each discovered specialist, assess relevance by matching the specialist's cognitive strategy and domain against the identified change categories
Select up to N specialists with the highest relevance to the actual changes
Document selection rationale in the REVIEW-SYNTHESIS.md Selection rationale field (e.g., "Selected security, performance, testing — diff adds new API endpoint with database queries and no test coverage")

Edge cases:

If N ≥ number of available specialists, include all (equivalent to all)
If the review target is trivial (e.g., single typo fix, comment-only changes), report the condition to the calling context — the caller decides whether to proceed or fall back to a simpler review mode
If adaptive selection would select 0 specialists (no strong relevance signal):
- Interactive mode (interactive: true): Present the user with options — proceed with all specialists, specify which to include, or report the condition to the calling context for fallback
- Non-interactive mode (interactive: false or smart): Report the condition to the calling context — the caller decides how to proceed

Compatibility: Adaptive selection is orthogonal to interaction mode — works with both parallel and debate.

Perspective Discovery

When perspectives is not none, discover perspective overlay files at 4 precedence levels (mirroring specialist discovery):

Workflow: Parse perspectives from review context — if explicit comma-separated list, resolve only those names
Project: Scan .paw/perspectives/<name>.md files in the repository
User: Scan ~/.paw/perspectives/<name>.md files
Built-in: Scan references/perspectives/<name>.md files

Resolution rules (parallel to specialist discovery):

Most-specific-wins for name conflicts (project overrides user overrides built-in)
Skip malformed perspective files (missing required sections) with a warning; continue with remaining roster
Unknown names: if a name in the comma-separated list matches no discoverable perspective file, warn and skip — consistent with specialist discovery's malformed-file handling
none → skip discovery entirely; engine behaves identically to current implementation with no perspective-related processing
auto → discover all perspectives, then select based on artifact signals (see Perspective Auto-Selection)

Perspective Auto-Selection

When perspectives is auto, analyze the review target and select up to perspective_cap perspectives per specialist:

Analyze artifact type, file types touched, affected subsystems, and estimated complexity
Temporal perspectives (premortem, retrospective): relevant for code changes with operational implications, scaling concerns, dependency management, or long-term maintenance impact
Adversarial perspectives (red-team): relevant for changes touching trust boundaries, external interfaces, authentication, authorization, or data handling
If no perspectives meet relevance threshold: select the built-in baseline perspective and note in synthesis

Selection is budget-aware: total subagent calls = specialists × assigned perspectives. The perspective_cap bounds per-specialist perspective count.

Prompt Composition

Compose the review prompt for each specialist subagent from up to five layers:

Shared rules — load references/specialists/_shared-rules.md once per review run (anti-sycophancy rules, confidence scoring, Toulmin output format)
Context preamble — inject the type-dependent preamble (see Context-Adaptive Preambles above) to frame the specialist's review lens
Specialist content — load the discovered specialist .md file (identity, cognitive strategy, behavioral rules, demand rationale, examples)
Perspective overlay — when a perspective is assigned, load the perspective file, resolve {specialist} placeholder with the specialist's name, and inject as evaluative lens framing. The overlay template includes the **Perspective** field instruction for Toulmin output attribution. When no perspective is assigned (perspectives: none), skip this layer entirely — composition is identical to the pre-change 4-layer model.
Review coordinates — review target location (diff range, artifact paths, or content description) and output directory so the subagent can self-gather context via git diff, view, and grep

Prompt budget overflow: If the composed prompt approaches model context limits, the perspective overlay (layer 4) is the first candidate for truncation, preserving specialist identity and review coordinates. A warning is emitted when truncation occurs. Given overlays are 50–100 words, this edge case is unlikely in practice.

If a specialist file contains shared_rules_included: true in its YAML frontmatter, skip shared rules injection to avoid duplication. Otherwise, always inject shared rules.

Replace [specialist-name] in the shared rules output format with the specialist's actual name (e.g., security, performance).

Execution

Execution depends on interaction_mode:

Parallel Mode (default)

Spawn parallel subagents using task tool with agent_type: "general-purpose". For each specialist (and each assigned perspective):

Compose prompt: shared rules + context preamble + specialist content + perspective overlay (if assigned) + review coordinates
Resolve model using precedence chain (see Model Assignment below)
Instruct the subagent to write its Toulmin-structured findings directly to the output directory:
- With perspective: REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md
- Without perspective (perspectives: none): REVIEW-{SPECIALIST-NAME}.md (backward compatible)
Each subagent's findings are attributed to the perspective that framed its review
The orchestrator receives only a brief completion status (success/failure, finding count) — NOT the full findings content

When perspectives are active, each specialist runs once per assigned perspective. With 2 perspectives selected, a specialist runs twice (once per perspective), producing separate review files.

Model Assignment: Resolve the model for each specialist using this precedence (most-specific-wins):

Specialist frontmatter: If the specialist .md file contains a model: field in YAML frontmatter, use that model
Pinned assignment: If specialist_models contains a specialist:model pair matching this specialist name, use the pinned model
Model pool: If specialist_models contains unpinned model names, distribute them round-robin across unpinned specialists (sort specialists alphabetically, cycle through pool list)
Session default: If none of the above apply, use the session's default model

If a specified model is unavailable or invalid, fall back to session default with a user-visible warning.

Log the specialist→model assignment map at review start so users can verify the distribution.

After all specialists complete, proceed to synthesis.

Debate Mode

Thread-based multi-round debate where findings become discussion threads with point/counterpoint exchanges. Produces richer evidence than parallel mode at the cost of more subagent calls.

Edge case: If only 1 specialist is selected, skip debate and use parallel mode (debate requires ≥2 specialists).

Round 1 (Initial sweep): Run all specialists in parallel (same as parallel mode, including perspective-aware execution if perspectives are active). Each finding becomes a thread with state open. When perspectives are active, perspective-variant findings from the same specialist participate as distinct positions — thread attribution includes both specialist name and perspective name.

Rounds 2–3 (Threaded responses): After each round, the synthesis agent (operating as PR triage lead) generates a round summary organized by thread:

For each thread: current state (open, agreed, contested), summary of positions, open questions
When perspectives are active, explicitly contrast findings across different perspectives from the same specialist to ensure the debate loop addresses intra-specialist perspective disagreements
This summary is the only inter-specialist communication (hub-and-spoke — specialists never see each other's raw findings)

Re-run specialists with: shared rules + context preamble + specialist content + perspective overlay (if assigned) + review coordinates + round summary (embedded — this is the only new information per round). Specialists can:

Refine their position on existing threads (cite thread ID)
Respond to summarized counterarguments
Add new threads (new findings become new open threads)

Each specialist appends round output to its existing review file (REVIEW-{SPECIALIST-NAME}.md or REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md when perspectives are active).

Adaptive termination: After each round, the synthesis agent evaluates whether new substantive findings emerged. If no new threads and no position changes on existing threads, global rounds terminate early. Hard cap: 3 global rounds.

Per-thread continuation (contested threads only): After global rounds close, threads still marked contested enter targeted continuation:

Only the specialists involved in the contested thread participate (2–3 specialists, not the full roster)
Max 2 additional exchanges per thread (total cap: 5 exchanges including global rounds)
Aggregate budget: Max 30 subagent calls across all continuation threads
Synthesis agent monitors each thread for convergence, deadlock, or trade-off identification
Trade-off detection: If a contested thread represents a genuine design trade-off (not a factual dispute), classify as trade-off and exit continuation — flag for user decision (interactive) or conservative default resolution (auto)

Thread states: open → agreed | contested | trade-off | resolved

User escalation (interactive/smart mode): When the orchestrating skill identifies a trade-off thread from the round summary, pause and present the trade-off to the user with both sides' evidence and positions, ask for a decision, then continue with the user's decision as context. In auto mode, apply conservative defaults (priority hierarchy: Correctness > Security > Reliability > Performance > Maintainability > Developer Experience) and flag in REVIEW-SYNTHESIS.md.

After all threads resolve or budget is exhausted, proceed to synthesis.

Synthesis

Spawn a synthesis subagent (agent_type: "general-purpose") that reads all REVIEW-*.md files in the output directory (covering both REVIEW-{SPECIALIST}.md and REVIEW-{SPECIALIST}-{PERSPECTIVE}.md patterns) directly via view tool and produces REVIEW-SYNTHESIS.md. The orchestrator sees only the final synthesis output.

The synthesis agent operates as a PR triage lead with these structural constraints:

May only merge, deduplicate, classify conflicts, and flag trade-offs
Must NOT generate new findings — it is not an additional reviewer
Must link every output claim to a specific specialist finding with evidence
Must randomize specialist input ordering to prevent position bias

Synthesis requirements:

Cluster by code location before merging — group findings by shared file + line range across specialists
Classify disagreements: factual dispute (one is wrong — resolve with evidence and rebuttal conditions) vs. trade-off (different quality objectives, both valid — escalate or flag)
Validate grounding for every finding: Direct (cites specific diff lines — full inclusion), Inferential (anchored in diff but requires reasoning — include with chain shown), Contextual (beyond the diff — demote to observations)
Confidence-weighted aggregation: weigh by stated confidence AND evidence quality, not count of specialists
Merge agreements, preserve dissent: compatible claims → single finding with combined evidence; unresolved disagreements → include both positions with "unresolved" flag
Proportional output: weight by evidence quality, not word count or finding count
Trade-off handling: in interactive/smart mode, escalate to user with shared facts, decision axis, options, and recommendation per priority hierarchy. In auto mode, apply priority hierarchy (Correctness > Security > Reliability > Performance > Maintainability > Developer Experience), document the decision, flag prominently

Perspective-aware synthesis (when perspectives are active):

Preserve **Perspective** attribution from specialist findings — each finding in the synthesis includes the perspective field from the source
When merging findings from different perspectives on the same specialist, treat perspective-attributed findings as distinct positions even if they address the same code location
Parallel mode conflict handling: Inter-perspective conflicts on the same specialist are surfaced with both positions and their perspective context, flagged as "high-signal perspective disagreement" in the Dissent Log
Debate mode conflict handling: Perspective-variant findings participate in debate as distinct positions; when synthesizing round summaries, explicitly contrast findings across different perspectives from the same specialist
Findings attributed to baseline were produced using the built-in baseline perspective (no evaluative frame shift)

REVIEW-SYNTHESIS.md structure:

The Finding sections (Must-Fix, Should-Fix, Consider) are the actionable items presented during interactive resolution. Trade-offs are resolved as part of findings when the user makes a decision. Observations, Dissent Log, Debate Trace, and Synthesis Trace are reference sections — they document the review process but are not presented as individual resolution items.

# REVIEW-SYNTHESIS.md

## Review Summary
- Mode: society-of-thought (parallel | debate)
- Specialists: [list of participating specialists]
- Context: [context value used for filtering, or "none"]
- Perspectives: [list of perspectives applied, or "none"]
- Perspective cap: [configured cap value]
- Selection rationale: [if adaptive mode was used]
- Rounds: [number of rounds completed, debate mode only]
- Threads: [total threads, agreed/contested/trade-off/resolved counts, debate mode only]

## Perspective Diversity
- Perspectives applied: [perspective-name → specialist list, for each active perspective]
- Selection mode: [auto | guided | none]
- Selection rationale: [if auto, why these perspectives were chosen for this artifact]
- Perspectives skipped: [if any, name + reason (cap, relevance threshold)]

## Must-Fix Findings
[Findings with severity: must-fix, each with specialist attribution, confidence, grounding tier]

## Should-Fix Findings
[Findings with severity: should-fix]

## Consider
[Findings with severity: consider]

## Trade-offs Requiring Decision
[Unresolved trade-offs — presented to user during finding resolution in interactive/smart mode, resolved via priority hierarchy in auto mode. Once resolved, the decision is recorded in the corresponding finding.]

## Observations
[Contextual-tier findings — beyond this diff, not presented during resolution. Retained as reference for future work.]

## Dissent Log
[Findings where specialists disagreed and how the disagreement was resolved. Includes inter-perspective disagreements (same specialist, different perspectives) flagged as high-signal perspective conflicts.]

## Debate Trace
[Debate mode only — thread-by-thread progression]
For each thread:
- Thread ID and initial finding
- Round-by-round responses from participating specialists
- Thread state at completion (agreed/contested/trade-off/resolved)
- Resolution method (evidence, user decision, conservative default, or budget exhaustion)

## Synthesis Trace
[For each finding: source specialist(s), grounding tier, conflict resolution method if any]

Output Artifacts

Create .gitignore with content * in the output directory (if not already present). This is a local-only scratch ignore marker — do NOT stage or commit it.

Review Type	Files Created
All modes, no perspectives	`REVIEW-{SPECIALIST-NAME}.md` per specialist
All modes, with perspectives	`REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md` per specialist-perspective pair
All modes	`REVIEW-SYNTHESIS.md`

Location: review context's output_dir.

Moderator Mode

Invocation: Moderator mode is a separate invocation from orchestration+synthesis. The calling skill invokes paw-sot once for orchestration and synthesis, handles its own resolution flow (apply/skip/discuss), then invokes paw-sot again for moderator mode if conditions are met. The calling skill provides the review context type, output_dir (containing individual specialist reviews and REVIEW-SYNTHESIS.md), and review coordinates.

After the calling skill completes finding resolution, moderator mode enables deeper interactive engagement with specialist personas.

Announce moderator mode with a brief prompt: available specialists, interaction options, and how to exit.

Interaction patterns:

Summon specialist: User references a specialist by name (e.g., "ask the security specialist about the auth flow"). Load the specialist's persona file, compose prompt with persona + shared rules + context preamble + review coordinates + REVIEW-SYNTHESIS.md context, and spawn a subagent that responds in-character using its cognitive strategy.
Challenge finding: User disagrees with a finding and provides reasoning. Spawn the originating specialist as a subagent with its full persona, the challenged finding, and the user's counter-argument. The specialist must respond with independent evidence — anti-sycophancy rules require it to either defend with new evidence or concede with specific reasoning for why the user's argument changes the assessment.
Request deeper analysis: User asks for focused analysis on a specific code area. Select the most relevant specialist (or user-specified) and spawn as subagent with focused scope.

Exit: User says "done", "continue", or "proceed" to exit moderator mode.

Skip condition: If interactive is false, no findings exist, or interactive is smart with no significant findings (only consider-tier and none qualify as quick wins), skip moderator mode entirely.

name	paw-sot
description	Society of Thought (SoT) — multi-perspective deliberative review engine. Orchestrates specialist personas with parallel or debate interaction, confidence-weighted synthesis, and interactive moderator mode. Use for code review, plan review, design analysis, or any scenario requiring diverse expert perspectives.

Society of Thought Engine

Execution Context: This skill is loaded into the calling agent's session (not spawned as a subagent), preserving user interactivity for trade-off decisions and moderator mode.

Capabilities

Specialist discovery with 4-level precedence (workflow → project → user → built-in)
Adaptive specialist selection based on content analysis
Parallel and debate interaction modes
Context-adaptive framing for code, artifact, or freeform review targets
Confidence-weighted synthesis with grounding validation
Post-synthesis moderator mode for deeper specialist engagement

Review Context Input Contract

The calling skill provides a review context describing what to review and how to orchestrate:

Field	Required	Values	Description
`type`	yes	`diff` \| `artifacts` \| `freeform`	What kind of content is being reviewed
`coordinates`	yes	diff range, artifact paths, or content description	What to point specialists at
`output_dir`	yes	directory path	Where to write REVIEW-*.md files
`specialists`	no	`all` (default) \| comma-separated names \| `adaptive:<N>`	Which specialists to invoke
`context`	no	string (type-dependent default — see Context Filtering)	Domain filter for specialists (case-insensitive match)
`interaction_mode`	yes	`parallel` \| `debate`	How specialists interact
`interactive`	yes	`true` \| `false` \| `smart`	Whether to pause for user decisions on trade-offs
`specialist_models`	no	`none` \| model pool \| pinned pairs \| mixed	Model assignment for specialists
`framing`	no	free text	Caller-provided preamble for `freeform` type
`perspectives`	no	`none` \| `auto` (default) \| comma-separated names	Which perspective overlays to apply
`perspective_cap`	no	positive integer (default `2`)	Max perspective overlays per specialist

Context-Adaptive Preambles

Before composing specialist prompts, inject a type-dependent preamble that frames how the specialist should interpret its cognitive strategy and domain for the review target.

Type diff (code/implementation review):

You are reviewing implementation changes (code diff). Apply your cognitive strategy and domain expertise to the actual code changes. Look for bugs, security issues, performance problems, pattern violations, and correctness gaps in the diff. Cite specific file paths and line numbers.

Type artifacts (design/planning review):

You are reviewing design and planning documents, not code. Apply your cognitive strategy to design decisions, architectural choices, and unstated assumptions. Your domain expertise should evaluate whether the planned approach is sound — look for gaps in reasoning, missing edge cases, feasibility risks, and cross-concern conflicts. Reference specific sections and claims in the documents rather than code lines.

Type freeform:

Use the caller-provided framing field content as the preamble. If no framing provided, use a neutral framing: "You are reviewing the provided content. Apply your cognitive strategy and domain expertise to identify issues, risks, and opportunities for improvement."

Specialist Discovery

Discover specialist personas at 4 precedence levels (most-specific-wins for name conflicts):

Workflow: Parse specialists from the review context — if an explicit comma-separated list, resolve only those names
Project: Scan .paw/personas/<name>.md files in the repository
User: Scan ~/.paw/personas/<name>.md files
Built-in: Scan references/specialists/<name>.md files (excluding _shared-rules.md)

Resolution rules:

If specialists is all (default): include all discovered specialists from all levels
If a fixed list (e.g., security, performance, testing): resolve each name against discovered specialists, most-specific-wins
If adaptive:<N>: discover all, then select N most relevant (see Adaptive Selection below)
Same filename at project level overrides user level overrides built-in
Skip malformed or empty specialist files with a warning; continue with remaining roster
If zero specialists found after discovery, fall back to built-in defaults with a warning

Context Filtering

The context field (domain filter) is distinct from the review context input structure.

Specialist context declaration: Specialists declare their domain via a context field in YAML frontmatter:

---
context: implementation
---

Matching behavior:

Each specialist declares a single context value (comma-separated or array values are not supported)
Case-insensitive string comparison (e.g., Business matches business)

Default behaviors:

Specialists without a context field, or with an empty/whitespace-only context value, default to implementation
Callers without context field:
- diff and artifacts types → default to implementation
- freeform type → no filtering (all specialists participate)

Zero-match handling: If context filtering results in zero matching specialists:

Interactive mode (interactive: true): Warn and prompt user — options: proceed with all specialists, specify different context, or abort
Non-interactive mode (interactive: false): Warn and proceed with all specialists as fallback
Smart mode (interactive: smart): Warn and prompt user (escalate on zero-match — context misconfiguration is more likely a setup error than a soft signal, warranting user confirmation)

Adaptive Selection

When specialists is adaptive:<N>, select the N most relevant specialists from the discovered roster (after any context filtering) based on content analysis.

Selection process:

Analyze the review target to identify dominant change categories — file types, affected subsystems, nature of changes (new logic, refactoring, config, API surface, data handling, test coverage)
For each discovered specialist, assess relevance by matching the specialist's cognitive strategy and domain against the identified change categories
Select up to N specialists with the highest relevance to the actual changes
Document selection rationale in the REVIEW-SYNTHESIS.md Selection rationale field (e.g., "Selected security, performance, testing — diff adds new API endpoint with database queries and no test coverage")

Edge cases:

If N ≥ number of available specialists, include all (equivalent to all)
If the review target is trivial (e.g., single typo fix, comment-only changes), report the condition to the calling context — the caller decides whether to proceed or fall back to a simpler review mode
If adaptive selection would select 0 specialists (no strong relevance signal):
- Interactive mode (interactive: true): Present the user with options — proceed with all specialists, specify which to include, or report the condition to the calling context for fallback
- Non-interactive mode (interactive: false or smart): Report the condition to the calling context — the caller decides how to proceed

Compatibility: Adaptive selection is orthogonal to interaction mode — works with both parallel and debate.

Perspective Discovery

When perspectives is not none, discover perspective overlay files at 4 precedence levels (mirroring specialist discovery):

Workflow: Parse perspectives from review context — if explicit comma-separated list, resolve only those names
Project: Scan .paw/perspectives/<name>.md files in the repository
User: Scan ~/.paw/perspectives/<name>.md files
Built-in: Scan references/perspectives/<name>.md files

Resolution rules (parallel to specialist discovery):

Most-specific-wins for name conflicts (project overrides user overrides built-in)
Skip malformed perspective files (missing required sections) with a warning; continue with remaining roster
Unknown names: if a name in the comma-separated list matches no discoverable perspective file, warn and skip — consistent with specialist discovery's malformed-file handling
none → skip discovery entirely; engine behaves identically to current implementation with no perspective-related processing
auto → discover all perspectives, then select based on artifact signals (see Perspective Auto-Selection)

Perspective Auto-Selection

When perspectives is auto, analyze the review target and select up to perspective_cap perspectives per specialist:

Analyze artifact type, file types touched, affected subsystems, and estimated complexity
Temporal perspectives (premortem, retrospective): relevant for code changes with operational implications, scaling concerns, dependency management, or long-term maintenance impact
Adversarial perspectives (red-team): relevant for changes touching trust boundaries, external interfaces, authentication, authorization, or data handling
If no perspectives meet relevance threshold: select the built-in baseline perspective and note in synthesis

Selection is budget-aware: total subagent calls = specialists × assigned perspectives. The perspective_cap bounds per-specialist perspective count.

Prompt Composition

Compose the review prompt for each specialist subagent from up to five layers:

Shared rules — load references/specialists/_shared-rules.md once per review run (anti-sycophancy rules, confidence scoring, Toulmin output format)
Context preamble — inject the type-dependent preamble (see Context-Adaptive Preambles above) to frame the specialist's review lens
Specialist content — load the discovered specialist .md file (identity, cognitive strategy, behavioral rules, demand rationale, examples)
Perspective overlay — when a perspective is assigned, load the perspective file, resolve {specialist} placeholder with the specialist's name, and inject as evaluative lens framing. The overlay template includes the **Perspective** field instruction for Toulmin output attribution. When no perspective is assigned (perspectives: none), skip this layer entirely — composition is identical to the pre-change 4-layer model.
Review coordinates — review target location (diff range, artifact paths, or content description) and output directory so the subagent can self-gather context via git diff, view, and grep

If a specialist file contains shared_rules_included: true in its YAML frontmatter, skip shared rules injection to avoid duplication. Otherwise, always inject shared rules.

Replace [specialist-name] in the shared rules output format with the specialist's actual name (e.g., security, performance).

Execution

Execution depends on interaction_mode:

Parallel Mode (default)

Spawn parallel subagents using task tool with agent_type: "general-purpose". For each specialist (and each assigned perspective):

Compose prompt: shared rules + context preamble + specialist content + perspective overlay (if assigned) + review coordinates
Resolve model using precedence chain (see Model Assignment below)
Instruct the subagent to write its Toulmin-structured findings directly to the output directory:
- With perspective: REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md
- Without perspective (perspectives: none): REVIEW-{SPECIALIST-NAME}.md (backward compatible)
Each subagent's findings are attributed to the perspective that framed its review
The orchestrator receives only a brief completion status (success/failure, finding count) — NOT the full findings content

When perspectives are active, each specialist runs once per assigned perspective. With 2 perspectives selected, a specialist runs twice (once per perspective), producing separate review files.

Model Assignment: Resolve the model for each specialist using this precedence (most-specific-wins):

Specialist frontmatter: If the specialist .md file contains a model: field in YAML frontmatter, use that model
Pinned assignment: If specialist_models contains a specialist:model pair matching this specialist name, use the pinned model
Model pool: If specialist_models contains unpinned model names, distribute them round-robin across unpinned specialists (sort specialists alphabetically, cycle through pool list)
Session default: If none of the above apply, use the session's default model

If a specified model is unavailable or invalid, fall back to session default with a user-visible warning.

Log the specialist→model assignment map at review start so users can verify the distribution.

After all specialists complete, proceed to synthesis.

Debate Mode

Thread-based multi-round debate where findings become discussion threads with point/counterpoint exchanges. Produces richer evidence than parallel mode at the cost of more subagent calls.

Edge case: If only 1 specialist is selected, skip debate and use parallel mode (debate requires ≥2 specialists).

Rounds 2–3 (Threaded responses): After each round, the synthesis agent (operating as PR triage lead) generates a round summary organized by thread:

For each thread: current state (open, agreed, contested), summary of positions, open questions
When perspectives are active, explicitly contrast findings across different perspectives from the same specialist to ensure the debate loop addresses intra-specialist perspective disagreements
This summary is the only inter-specialist communication (hub-and-spoke — specialists never see each other's raw findings)

Refine their position on existing threads (cite thread ID)
Respond to summarized counterarguments
Add new threads (new findings become new open threads)

Each specialist appends round output to its existing review file (REVIEW-{SPECIALIST-NAME}.md or REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md when perspectives are active).

Per-thread continuation (contested threads only): After global rounds close, threads still marked contested enter targeted continuation:

Only the specialists involved in the contested thread participate (2–3 specialists, not the full roster)
Max 2 additional exchanges per thread (total cap: 5 exchanges including global rounds)
Aggregate budget: Max 30 subagent calls across all continuation threads
Synthesis agent monitors each thread for convergence, deadlock, or trade-off identification
Trade-off detection: If a contested thread represents a genuine design trade-off (not a factual dispute), classify as trade-off and exit continuation — flag for user decision (interactive) or conservative default resolution (auto)

Thread states: open → agreed | contested | trade-off | resolved

After all threads resolve or budget is exhausted, proceed to synthesis.

Synthesis

The synthesis agent operates as a PR triage lead with these structural constraints:

May only merge, deduplicate, classify conflicts, and flag trade-offs
Must NOT generate new findings — it is not an additional reviewer
Must link every output claim to a specific specialist finding with evidence
Must randomize specialist input ordering to prevent position bias

Synthesis requirements:

Cluster by code location before merging — group findings by shared file + line range across specialists
Classify disagreements: factual dispute (one is wrong — resolve with evidence and rebuttal conditions) vs. trade-off (different quality objectives, both valid — escalate or flag)
Validate grounding for every finding: Direct (cites specific diff lines — full inclusion), Inferential (anchored in diff but requires reasoning — include with chain shown), Contextual (beyond the diff — demote to observations)
Confidence-weighted aggregation: weigh by stated confidence AND evidence quality, not count of specialists
Merge agreements, preserve dissent: compatible claims → single finding with combined evidence; unresolved disagreements → include both positions with "unresolved" flag
Proportional output: weight by evidence quality, not word count or finding count
Trade-off handling: in interactive/smart mode, escalate to user with shared facts, decision axis, options, and recommendation per priority hierarchy. In auto mode, apply priority hierarchy (Correctness > Security > Reliability > Performance > Maintainability > Developer Experience), document the decision, flag prominently

Perspective-aware synthesis (when perspectives are active):

Preserve **Perspective** attribution from specialist findings — each finding in the synthesis includes the perspective field from the source
When merging findings from different perspectives on the same specialist, treat perspective-attributed findings as distinct positions even if they address the same code location
Parallel mode conflict handling: Inter-perspective conflicts on the same specialist are surfaced with both positions and their perspective context, flagged as "high-signal perspective disagreement" in the Dissent Log
Debate mode conflict handling: Perspective-variant findings participate in debate as distinct positions; when synthesizing round summaries, explicitly contrast findings across different perspectives from the same specialist
Findings attributed to baseline were produced using the built-in baseline perspective (no evaluative frame shift)

REVIEW-SYNTHESIS.md structure:

# REVIEW-SYNTHESIS.md

## Review Summary
- Mode: society-of-thought (parallel | debate)
- Specialists: [list of participating specialists]
- Context: [context value used for filtering, or "none"]
- Perspectives: [list of perspectives applied, or "none"]
- Perspective cap: [configured cap value]
- Selection rationale: [if adaptive mode was used]
- Rounds: [number of rounds completed, debate mode only]
- Threads: [total threads, agreed/contested/trade-off/resolved counts, debate mode only]

## Perspective Diversity
- Perspectives applied: [perspective-name → specialist list, for each active perspective]
- Selection mode: [auto | guided | none]
- Selection rationale: [if auto, why these perspectives were chosen for this artifact]
- Perspectives skipped: [if any, name + reason (cap, relevance threshold)]

## Must-Fix Findings
[Findings with severity: must-fix, each with specialist attribution, confidence, grounding tier]

## Should-Fix Findings
[Findings with severity: should-fix]

## Consider
[Findings with severity: consider]

## Trade-offs Requiring Decision
[Unresolved trade-offs — presented to user during finding resolution in interactive/smart mode, resolved via priority hierarchy in auto mode. Once resolved, the decision is recorded in the corresponding finding.]

## Observations
[Contextual-tier findings — beyond this diff, not presented during resolution. Retained as reference for future work.]

## Dissent Log
[Findings where specialists disagreed and how the disagreement was resolved. Includes inter-perspective disagreements (same specialist, different perspectives) flagged as high-signal perspective conflicts.]

## Debate Trace
[Debate mode only — thread-by-thread progression]
For each thread:
- Thread ID and initial finding
- Round-by-round responses from participating specialists
- Thread state at completion (agreed/contested/trade-off/resolved)
- Resolution method (evidence, user decision, conservative default, or budget exhaustion)

## Synthesis Trace
[For each finding: source specialist(s), grounding tier, conflict resolution method if any]

Output Artifacts

Create .gitignore with content * in the output directory (if not already present). This is a local-only scratch ignore marker — do NOT stage or commit it.

Review Type	Files Created
All modes, no perspectives	`REVIEW-{SPECIALIST-NAME}.md` per specialist
All modes, with perspectives	`REVIEW-{SPECIALIST-NAME}-{PERSPECTIVE}.md` per specialist-perspective pair
All modes	`REVIEW-SYNTHESIS.md`

Location: review context's output_dir.

Moderator Mode

Invocation: Moderator mode is a separate invocation from orchestration+synthesis. The calling skill invokes paw-sot once for orchestration and synthesis, handles its own resolution flow (apply/skip/discuss), then invokes paw-sot again for moderator mode if conditions are met. The calling skill provides the review context type, output_dir (containing individual specialist reviews and REVIEW-SYNTHESIS.md), and review coordinates.

After the calling skill completes finding resolution, moderator mode enables deeper interactive engagement with specialist personas.

Announce moderator mode with a brief prompt: available specialists, interaction options, and how to exit.

Interaction patterns:

Summon specialist: User references a specialist by name (e.g., "ask the security specialist about the auth flow"). Load the specialist's persona file, compose prompt with persona + shared rules + context preamble + review coordinates + REVIEW-SYNTHESIS.md context, and spawn a subagent that responds in-character using its cognitive strategy.
Challenge finding: User disagrees with a finding and provides reasoning. Spawn the originating specialist as a subagent with its full persona, the challenged finding, and the user's counter-argument. The specialist must respond with independent evidence — anti-sycophancy rules require it to either defend with new evidence or concede with specific reasoning for why the user's argument changes the assessment.
Request deeper analysis: User asks for focused analysis on a specific code area. Select the most relevant specialist (or user-specified) and spawn as subagent with focused scope.

Exit: User says "done", "continue", or "proceed" to exit moderator mode.

paw-sot

More from this repository

Society of Thought Engine

Capabilities

Review Context Input Contract

Context-Adaptive Preambles

Specialist Discovery

Context Filtering

Adaptive Selection

Perspective Discovery

Perspective Auto-Selection

Prompt Composition

Execution

Parallel Mode (default)

Debate Mode

Synthesis

Output Artifacts

Moderator Mode

Society of Thought Engine

Capabilities

Review Context Input Contract

Context-Adaptive Preambles

Specialist Discovery

Context Filtering

Adaptive Selection

Perspective Discovery

Perspective Auto-Selection

Prompt Composition

Execution

Parallel Mode (default)

Debate Mode

Synthesis

Output Artifacts

Moderator Mode

More from this repository