تشغيل أي مهارة في Manus بنقرة واحدة

ابدأ الآن

model-router

Model selection guidance — match task complexity to model capability

تشغيل في Manus

نظرة عامة

Model selection guidance — match task complexity to model capability

أمر التثبيت

npx skills add https://github.com/andrem-sec/psc-comet --skill model-router

انسخ والصق هذا الأمر في Claude Code لتثبيت المهارة

المصدر

andrem-sec/psc-comet

النجوم٤

التفرعات٠

آخر تحديث٢٠ أبريل ٢٠٢٦ في ١٣:٣٣

SKILL.md

readonly

name	model-router
description	Model selection guidance — match task complexity to model capability
version	0.1.0
level	1
triggers	["which model","model router","use haiku","use opus","cost-sensitive"]
context_files	[]
steps	[{"name":"Classify Task","description":"Identify the task category and complexity"},{"name":"Apply Routing Rules","description":"Match to model tier using the routing table"},{"name":"State Recommendation","description":"Name the model and the reason in one line"}]

Model Router Skill

Match the task to the right model. Using Opus on trivial tasks wastes money. Using Haiku on complex reasoning tasks wastes time and produces wrong answers.

What Claude Gets Wrong Without This Skill

Without routing guidance, every task defaults to the same model regardless of whether it needs that model's capability. The result is either unnecessary cost (heavy model on simple tasks) or degraded quality (light model on complex tasks).

Routing Table

Use Haiku (fast, low cost)

File retrieval and search
Simple data transformation
Boilerplate generation
Lookup and summarization of known content
Repetitive formatting tasks
Test data generation
Simple code scaffolding

Use Sonnet (balanced — default)

Most software engineering tasks
Code review and debugging
Planning and decomposition
Multi-step implementation
Technical writing
API integration
Standard refactoring

Use Opus (deep reasoning, highest capability)

Architectural decisions with significant tradeoffs
Complex debugging with multiple interacting causes
Security audit requiring deep analysis
Tasks requiring reasoning across very long contexts
Novel problem solving with no clear precedent
High-stakes decisions where error is costly
Cross-domain synthesis
Agent-type selection and multi-agent workflow design (Opus achieves ~83% vs Sonnet ~72% on tool modality selection -- source: Terminal Agents paper, 2604.00073)

Pro plan note: Opus requires Anthropic Max plan. If on Pro plan, Sonnet will still work for agent/tool selection but may default to suboptimal choices. Treat Opus as a soft recommendation, not a hard requirement, for this category.

Agent-Level Routing

Apply the same logic to agent model selection in frontmatter:

model: claude-haiku-4-5-20251001    # retrieval, simple execution
model: claude-sonnet-4-6             # standard work (default)
model: claude-opus-4-6              # deep reasoning, high stakes

Two-Stage Judge Escalation

Before delivering Sonnet output, scan for low-confidence signals. If any are present, escalate to Opus and re-run the same prompt before delivering:

Escalation signals:

Hedging language ("I think", "probably", "might", "I'm not sure", "it depends")
Multiple competing answers presented without a clear recommendation
Output that contradicts known session context or prior decisions
Explicit uncertainty: "I cannot determine" or "more information needed" on a task that should be answerable

On escalation:

Re-run with Opus
Deliver the Opus output
Note the escalation: "Sonnet was uncertain -- Opus used"

Do not escalate on tasks where Sonnet hedging is appropriate (open-ended questions, design tradeoffs with no clear winner). Escalate only when a concrete answer was expected and Sonnet failed to commit.

Fallback Model Strategy

When a model fails or produces poor output, escalate to the next tier:

Haiku → Sonnet: If Haiku produces incorrect or incomplete output Sonnet → Opus: If Sonnet gets stuck or produces low-confidence results Opus → Human: If Opus cannot solve the problem, surface to user

Document the fallback decision:

Primary model: Haiku (simple transformation)
Result: Failed - produced malformed JSON
Fallback model: Sonnet
Result: Success
Reason for escalation: Task complexity was misjudged, required parsing ambiguous input

This creates a trail showing why model selection changed mid-task.

15-Minute Unit Rule (Task Decomposition Guidance)

If a task cannot be completed in 15 minutes, it is not a unit — it is a task list.

Break it down:

Each unit should be independently verifiable (has a clear done condition)
Each unit should have a single dominant risk (not multiple unknowns)
Each unit should produce a testable artifact

Example of bad decomposition:

"Implement user authentication" (too large, multiple risks)

Example of good decomposition:

"Create user schema with password_hash field" (15 min, one risk: schema design)
"Write password hashing function with bcrypt" (15 min, one risk: crypto library)
"Add login endpoint that validates password" (15 min, one risk: validation logic)
"Add session token generation" (15 min, one risk: token security)
"Write integration test for full auth flow" (15 min, one risk: test setup)

Why 15 minutes: Matches human attention span, allows frequent checkpoints, makes progress measurable.

Model routing + 15-min rule: If a Sonnet task takes >15 minutes, either decompose further OR escalate to Opus.

Cost Awareness

Running 10 Haiku calls costs roughly the same as 1 Sonnet call. Running 10 Sonnet calls costs roughly the same as 1 Opus call.

For batch operations on simple tasks, always use Haiku. For orchestrator-level decisions in agent teams, consider Opus. For most interactive work, Sonnet is correct.

Orchestrator-Level Routing (Multi-Agent Workflows)

In multi-agent workflows, model assignment is per agent role, not per session. The session default does not override an agent's role-based assignment.

Rule: Assign models in the agent's spawn prompt or YAML frontmatter, not in the orchestrator's session config.

Architect agent  → Opus   (even if session default is Sonnet)
Implementer agent → Sonnet
Verifier agent   → Sonnet
Classifier agent → Haiku

Reference agentic-engineering skill for the full per-role table and rationale.

Anti-Patterns

Do not route to Opus because a task feels important. Route to Opus because it requires deep multi-step reasoning that Sonnet demonstrably gets wrong.

Do not route to Haiku because you want to save money on a task where quality matters.

Mandatory Checklist

Verify the task was classified before the model was selected
Verify the recommendation names both the model and the reason
Verify Opus is not the default recommendation — it should be the exception
Verify fallback model is documented if primary model failed
Verify tasks >15 minutes are flagged for decomposition or model escalation

المزيد من هذا المستودع

نفس المستودع

library-pipeline

andrem-sec/psc-comet

Convert PDF/EPUB library to Markdown and generate Obsidian MOC notes

2026-04-224

strategic-compact

andrem-sec/psc-comet

Hook-based compaction suggestions at logical task boundaries

2026-04-224

token-budget

andrem-sec/psc-comet

Context window management — track spend, decide when to compact, preserve state

2026-04-224

heartbeat

andrem-sec/psc-comet

Session-start orientation — loads context, surfaces learnings, confirms registry

2026-04-204

code-review

andrem-sec/psc-comet

Quality and semantic review — catches what automated tools miss

2026-04-204

consensus-plan

andrem-sec/psc-comet

Planner → Architect → Critic deliberation loop — produces a formally validated ADR

2026-04-204

المصدر

andrem-sec

andrem-sec/psc-comet

فتح مستودع GitHub عرض مستودعات المنشئ

أمر التثبيت

تنزيل

تشغيل في Manus

مفيد لـSOC

مطوّرو البرمجياتمهن الحاسوب والرياضيات15-1252L4

name	model-router
description	Model selection guidance — match task complexity to model capability
version	0.1.0
level	1
triggers	["which model","model router","use haiku","use opus","cost-sensitive"]
context_files	[]
steps	[{"name":"Classify Task","description":"Identify the task category and complexity"},{"name":"Apply Routing Rules","description":"Match to model tier using the routing table"},{"name":"State Recommendation","description":"Name the model and the reason in one line"}]

Model Router Skill

Match the task to the right model. Using Opus on trivial tasks wastes money. Using Haiku on complex reasoning tasks wastes time and produces wrong answers.

What Claude Gets Wrong Without This Skill

Routing Table

Use Haiku (fast, low cost)

File retrieval and search
Simple data transformation
Boilerplate generation
Lookup and summarization of known content
Repetitive formatting tasks
Test data generation
Simple code scaffolding

Use Sonnet (balanced — default)

Most software engineering tasks
Code review and debugging
Planning and decomposition
Multi-step implementation
Technical writing
API integration
Standard refactoring

Use Opus (deep reasoning, highest capability)

Architectural decisions with significant tradeoffs
Complex debugging with multiple interacting causes
Security audit requiring deep analysis
Tasks requiring reasoning across very long contexts
Novel problem solving with no clear precedent
High-stakes decisions where error is costly
Cross-domain synthesis
Agent-type selection and multi-agent workflow design (Opus achieves ~83% vs Sonnet ~72% on tool modality selection -- source: Terminal Agents paper, 2604.00073)

Agent-Level Routing

Apply the same logic to agent model selection in frontmatter:

model: claude-haiku-4-5-20251001    # retrieval, simple execution
model: claude-sonnet-4-6             # standard work (default)
model: claude-opus-4-6              # deep reasoning, high stakes

Two-Stage Judge Escalation

Before delivering Sonnet output, scan for low-confidence signals. If any are present, escalate to Opus and re-run the same prompt before delivering:

Escalation signals:

Hedging language ("I think", "probably", "might", "I'm not sure", "it depends")
Multiple competing answers presented without a clear recommendation
Output that contradicts known session context or prior decisions
Explicit uncertainty: "I cannot determine" or "more information needed" on a task that should be answerable

On escalation:

Re-run with Opus
Deliver the Opus output
Note the escalation: "Sonnet was uncertain -- Opus used"

Fallback Model Strategy

When a model fails or produces poor output, escalate to the next tier:

Document the fallback decision:

Primary model: Haiku (simple transformation)
Result: Failed - produced malformed JSON
Fallback model: Sonnet
Result: Success
Reason for escalation: Task complexity was misjudged, required parsing ambiguous input

This creates a trail showing why model selection changed mid-task.

15-Minute Unit Rule (Task Decomposition Guidance)

If a task cannot be completed in 15 minutes, it is not a unit — it is a task list.

Break it down:

Each unit should be independently verifiable (has a clear done condition)
Each unit should have a single dominant risk (not multiple unknowns)
Each unit should produce a testable artifact

Example of bad decomposition:

"Implement user authentication" (too large, multiple risks)

Example of good decomposition:

"Create user schema with password_hash field" (15 min, one risk: schema design)
"Write password hashing function with bcrypt" (15 min, one risk: crypto library)
"Add login endpoint that validates password" (15 min, one risk: validation logic)
"Add session token generation" (15 min, one risk: token security)
"Write integration test for full auth flow" (15 min, one risk: test setup)

Why 15 minutes: Matches human attention span, allows frequent checkpoints, makes progress measurable.

Model routing + 15-min rule: If a Sonnet task takes >15 minutes, either decompose further OR escalate to Opus.

Cost Awareness

Running 10 Haiku calls costs roughly the same as 1 Sonnet call. Running 10 Sonnet calls costs roughly the same as 1 Opus call.

For batch operations on simple tasks, always use Haiku. For orchestrator-level decisions in agent teams, consider Opus. For most interactive work, Sonnet is correct.

Orchestrator-Level Routing (Multi-Agent Workflows)

In multi-agent workflows, model assignment is per agent role, not per session. The session default does not override an agent's role-based assignment.

Rule: Assign models in the agent's spawn prompt or YAML frontmatter, not in the orchestrator's session config.

Architect agent  → Opus   (even if session default is Sonnet)
Implementer agent → Sonnet
Verifier agent   → Sonnet
Classifier agent → Haiku

Reference agentic-engineering skill for the full per-role table and rationale.

Anti-Patterns

Do not route to Opus because a task feels important. Route to Opus because it requires deep multi-step reasoning that Sonnet demonstrably gets wrong.

Do not route to Haiku because you want to save money on a task where quality matters.

Mandatory Checklist

Verify the task was classified before the model was selected
Verify the recommendation names both the model and the reason
Verify Opus is not the default recommendation — it should be the exception
Verify fallback model is documented if primary model failed
Verify tasks >15 minutes are flagged for decomposition or model escalation