with one click
llm-council
// [Decision Support] Use when pressure-testing irreversible, high-stakes decisions with adversarial AI advisors.
// [Decision Support] Use when pressure-testing irreversible, high-stakes decisions with adversarial AI advisors.
[HINT] Download the complete skill directory including SKILL.md and all related files
| name | llm-council |
| description | [Decision Support] Use when pressure-testing irreversible, high-stakes decisions with adversarial AI advisors. |
Codex compatibility note:
- Invoke repository skills with
$skill-namein Codex; this mirrored copy rewrites legacy Claude/skill-namereferences.- Prefer the
plan-hardskill for planning guidance in this Codex mirror.- Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.
- User-question prompts mean to ask the user directly in Codex.
- Ignore Claude-specific mode-switch instructions when they appear.
- Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.
- Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required
spawn_agentsubagent(s) for that task.- Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.
- For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.
- If a required step/tool cannot run in this environment, stop and ask the user before adapting.
Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.
Always read:
docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)docs/project-reference/lessons.md (always-on guardrails and anti-patterns)Situation-based docs:
backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.mdfrontend-patterns-reference.md, scss-styling-guide.md, design-system/README.mdfeature-docs-reference.mdintegration-test-reference.mde2e-test-reference.mdcode-review-rules.md plus domain docs above based on changed filesDo not read all docs blindly. Start from docs-index-reference.md, then open only relevant files for the task.
[IMPORTANT] MUST ATTENTION use council only for multi-option, hard-to-reverse, high-stakes decisions. NEVER council trivial, factual, reversible, or single-option questions. [IMPORTANT] MUST ATTENTION spawn 5 advisors in parallel, then 5 fresh peer reviewers in parallel, then chairman synthesis. [IMPORTANT] MUST ATTENTION require evidence for code/architecture claims:
file:line, graph trace, or explicit "insufficient evidence."
Goal: Adversarial decision support for costly wrong choices. Five advisors analyze independently, five fresh reviewers critique anonymously, chairman produces final verdict + report artifacts.
Workflow: Gate → Frame → 5 parallel advisors → anonymized 5-reviewer peer review → chairman synthesis → paired HTML/MD reports.
Key Rules:
$why-review → $plan-validate → $llm-council..code-graph/graph.db exists..agents/skills/llm-council/SKILL.md.Run before advisors.
| Gate | Required | Route if false |
|---|---|---|
| Multi-option | >=2 viable paths | Single-option rationale → $why-review |
| Hard to reverse | Architecture, stack, pricing, hiring, irreversible refactor | Reversible choice → answer directly |
| Real stakes | Wrong call costs >=1 week, money, trust, or strategic position | Low stakes → answer directly |
| Multi-angle value | Contrarian/first-principles/upside/outside/execution views change answer | Factual/single-domain → answer directly |
If any gate fails, state failed gate + lighter route. NEVER council "should I use markdown."
Good prompts: pricing model, positioning angle, pivot, hiring vs automation, architecture bet, high-risk launch, irreversible refactor. Bad prompts: factual lookup, simple yes/no, content generation, summarization, bugfix, package upgrade, routine refactor, anything decidable with one grep.
Each advisor = thinking dimension, not persona costume. One strong angle each.
| Advisor | Think |
|---|---|
| Contrarian | What fails? What assumption kills plan? |
| First Principles Thinker | What problem are we solving? Which assumptions need rebuild? |
| Expansionist | What upside, adjacent opportunity, undervalued path is missing? |
| Outsider | What confuses someone with no context? What jargon hides value? |
| Executor | What can happen Monday morning? Fastest validated path? |
Tensions: Contrarian vs Expansionist, First Principles vs Executor, Outsider grounds both.
When user says "council this", enrich then frame.
Context discovery budget: <=30 seconds; read 2-3 high-signal files.
Search order: CLAUDE.md / AGENTS.md; docs/project-config.json; docs/project-reference/project-structure-reference.md; matching docs/project-reference/*{domain-entities,backend-patterns,frontend-patterns,code-review-rules}*; docs/specs/; memory/; user-referenced files; plans/reports/council-*; domain data (pricing -> revenue, architecture -> service map, tech stack -> dependencies).
Code/architecture gate: If question references existing code, services, files, or blast radius, run before framing:
python .claude/scripts/code_graph trace <key-file> --direction both --json
Skip graph only when .code-graph/graph.db missing or question is non-code.
Framed question includes: core decision, user context, workspace evidence, stakes, constraints, known unknowns. Do not add your opinion. Ask exactly one clarifying question only if prompt is too vague.
Spawn all 5 advisors simultaneously. Each gets identity, framed question, evidence rules, output constraints.
You are [Advisor Name] on an LLM Council.
Thinking style: [advisor dimension from table]
Question:
---
[framed question]
---
EVIDENCE RULES:
- Code/architecture claims require `file:line`, graph trace, or "I don't have enough evidence yet."
- If existing code context is needed, run:
python .claude/scripts/code_graph trace <file> --direction both --json
python .claude/scripts/code_graph connections <file> --json
- Cite trace output for blast radius, callers, downstream impact.
- Confidence: 95-100% full trace | 80-94% main paths | 60-79% partial | <60% do not recommend.
- Do NOT speculate. Name missing evidence instead.
Respond from assigned angle. Direct, specific, unbalanced by design. Other advisors cover other angles.
Length: 150-300 words plus one-line confidence declaration. No preamble.
Collect responses. Randomize/anonymize as Response A-E. Spawn 5 fresh reviewers in parallel.
You are reviewing an LLM Council.
Question:
---
[framed question]
---
Anonymized responses:
**Response A:** [response]
**Response B:** [response]
**Response C:** [response]
**Response D:** [response]
**Response E:** [response]
Answer:
1. Strongest response? Why?
2. Biggest blind spot? What is missing?
3. What did all five miss?
Reference responses by letter. Be specific. Under 200 words.
Chairman receives original question, framed question, de-anonymized advisor responses, peer reviews, anonymization map.
You are Chairman of an LLM Council. Synthesize 5 advisors + peer reviews into final verdict.
Question:
---
[framed question]
---
ADVISOR RESPONSES:
**Contrarian:** [response]
**First Principles Thinker:** [response]
**Expansionist:** [response]
**Outsider:** [response]
**Executor:** [response]
PEER REVIEWS:
[all peer reviews]
Produce exact structure:
## Where the Council Agrees
[Independent convergences; high-confidence signals.]
## Where the Council Clashes
[Genuine disagreements. Present both sides; explain why reasonable advisors disagree.]
## Blind Spots the Council Caught
[Only emerged through peer review.]
## The Recommendation
[Clear direct recommendation. Not "it depends."]
## The One Thing to Do First
[Single concrete next step.]
Be direct. Do not hedge. Give clarity unavailable from one perspective.
Quality loop: If chairman misses required sections, ignores degraded-quality state, or makes unsupported code claims, repair with fresh chairman prompt. Max 3 repair rounds; then escalate with missing evidence.
Write both files; create plans/reports/ if missing.
plans/reports/council-{YYMMDD-HHMM}-{kebab-slug}.html
plans/reports/council-{YYMMDD-HHMM}-{kebab-slug}.md
{YYMMDD-HHMM} = session datetime. {kebab-slug} = 3-6 word question descriptor. Both share prefix. NEVER write artifacts to workspace root or docs/.
HTML: self-contained inline CSS, question top, prominent chairman verdict, agreement/disagreement visual, collapsed advisor responses, collapsed peer review highlights, footer timestamp/topic, professional briefing style, open after generation.
Markdown transcript: original question, framed question, all advisor responses, all peer reviews, anonymization mapping, chairman synthesis.
Council complete.
Report: plans/reports/council-{YYMMDD-HHMM}-{kebab-slug}.html
Transcript: plans/reports/council-{YYMMDD-HHMM}-{kebab-slug}.md
Verdict: [1-2 sentence chairman recommendation]
First action: [single next step]
Opt-in escalation hook from host skills. NEVER wire into bugfix, refactor, migration, package-upgrade, or test-* workflows.
| Host skill | Mode | Default | Gate |
|---|---|---|---|
architecture-design | Always-offer after ## Next Steps | Skip | User chooses |
tech-stack-research | Always-offer after ## Next Steps | Skip | User chooses |
why-review | Conditional on active plan/PBI frontmatter | Escalate when gate fires | 8-OR gate |
prioritize | Conditional on ranking output | Escalate when gate fires | RICE top-2 within 15%, MoSCoW tie, or stakeholder disagreement |
why-review Gate SchemaGate fires when ANY field true. Absent fields default no-fire; gate opt-in via frontmatter, never opt-out.
| Field | Convention | Fires when |
|---|---|---|
cross_service_impact | NONE / PARTIAL / FULL | value != NONE |
breaking_changes | bool | true |
complexity | low / medium / high / critical | high, critical, or story_points >= 13 |
new_framework | bool | true |
irreversible | bool | true |
security_critical | bool | true |
performance_critical | bool | true |
cost_high | bool | true |
Host prompt copy MUST cite cheaper rungs: $why-review, $plan-validate, $llm-council.
IMPORTANT MUST ATTENTION use council only for multi-option, hard-to-reverse, high-stakes decisions.
IMPORTANT MUST ATTENTION spawn 5 advisors in parallel, then 5 fresh peer reviewers in parallel, then chairman synthesis.
IMPORTANT MUST ATTENTION require evidence for code/architecture claims: file:line, graph trace, or explicit "insufficient evidence."
IMPORTANT MUST ATTENTION mark verdict degraded if fewer than 5 usable advisor responses return.
IMPORTANT MUST ATTENTION write paired HTML + Markdown artifacts under plans/reports/ and open HTML.
IMPORTANT MUST ATTENTION sync every canonical edit to .agents/skills/llm-council/SKILL.md.
Anti-Rationalization:
| Evasion | Rebuttal |
|---|---|
| "This decision feels important" | Gate it: multi-option, hard-to-reverse, real stakes, multi-angle value. |
| "One advisor can handle it" | Council value comes from independent angles + anonymous peer review. |
| "Sequential spawn is simpler" | Sequential spawn contaminates independence. Parallel spawn required. |
| "Four advisors is close enough" | Missing angle changes verdict quality. Mark degraded. |
| "Evidence would slow us down" | Unsupported code/architecture claims are speculation. Use graph/file proof or say insufficient evidence. |
Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json
$workflow-start <workflowId> for standard; sequence custom steps manually[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming hypothesis or writing code.
Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.
Top Rules (apply always):
ExecuteInjectScopedAsync for parallel async + repo/UoW — NEVER ExecuteUowTaskwhere python/where py) — NEVER assume python/python3 resolvesExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates new UoW but reuses outer DI scope (same DbContext) — parallel iterations sharing non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates new UoW + new DI scope (fresh repo per iteration).AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define own event for core to consume.HrManagerOrHrOrPayrollHrOperationsPolicy names set members, not what it guards. Add role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: adding/removing member forces rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — expresses HAPPENS, not membership.python/python3 resolves — verify alias first. Python may not be in bash PATH under those names. Check: where python / where py. Prefer py (Windows Python Launcher) for one-liners, node if JS alternative exists.Test-specific lessons →
docs/project-reference/integration-test-reference.mdLessons Learned section. Production-code anti-patterns →docs/project-reference/backend-patterns-reference.mdAnti-Patterns section. Generic debugging/refactoring reminders → System Lessons in.claude/hooks/lib/prompt-injections.cjs.
ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption){CoreServiceName}...RequestBusMessagepython/python3 resolves — run where python/where py first, use py launcher or nodeBreak work into small tasks (task tracking) before starting. Add final task: "Analyze AI mistakes & lessons learned".
Extract lessons — ROOT CAUSE ONLY, not symptom fixes:
$learn.$code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve review skill instead.$learn.
[TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.