| name | assess |
| license | MIT |
| compatibility | Claude Code 2.1.76+. Requires memory MCP server. |
| description | Assesses and rates quality 0-10 across multiple dimensions (correctness, maintainability, security, performance, testability, simplicity) with pros/cons analysis. Compares against project conventions and prior decisions from memory. Produces structured evaluation reports with actionable improvement suggestions. Use when evaluating code, designs, architectures, or comparing alternative approaches. |
| context | fork |
| version | 1.7.0 |
| author | OrchestKit |
| tags | ["assessment","evaluation","quality","comparison","pros-cons","rating"] |
| user-invocable | true |
| allowed-tools | ["AskUserQuestion","Read","Grep","Glob","Task","TaskCreate","TaskUpdate","TaskList","ToolSearch","mcp__memory__search_nodes","Bash"] |
| skills | ["code-review-playbook","quality-gates","architecture-decision-record","memory","chain-patterns"] |
| argument-hint | [code-path-or-topic] [--render=markdown|json-render|both] [--effort=low|medium|high|xhigh] |
| complexity | high |
| persuasion-type | guidance |
| effort | high |
| model | sonnet |
| hooks | {"PreToolUse":[{"matcher":"Read","command":"${CLAUDE_PLUGIN_ROOT}/hooks/bin/run-hook.mjs skill/assessment-baseline-loader","once":true}]} |
| metadata | {"category":"document-asset-creation","mcp-server":"memory"} |
| triggers | {"keywords":["assess","asses","rate","evaluate","grade","score","compare","how good","how bad","red flags","trade-offs","pros and cons","good enough"],"examples":["rate this code from 0 to 10","is this approach good enough for production?","evaluate the trade-offs between Redis vs Postgres"],"anti-triggers":["fix","implement","build","test","commit","review pr","explore"]} |
Assess
Comprehensive assessment skill for answering "is this good?" with structured evaluation, scoring, and actionable recommendations.
Quick Start
/ork:assess backend/app/services/auth.py
/ork:assess our caching strategy
/ork:assess --model=opus the current database schema
/ork:assess frontend/src/components/Dashboard
Effort levels (CC 2.1.111+ adds xhigh)
| Effort | Behavior |
|---|---|
| low / medium | Subset of dimensions, faster turnaround |
| high (default) | All seven dimensions with pros/cons |
| xhigh (Opus 4.7 only) | All seven dimensions + one additional assessor pass focused on uncertainty/caveats; emits confidence per dimension |
xhigh silently falls back to high on non-Opus-4.7 models. /ork:doctor warns when xhigh is used without Opus 4.7.
Argument Resolution
TARGET = "$ARGUMENTS"
MODEL_OVERRIDE = None
for token in "$ARGUMENTS".split():
if token.startswith("--model="):
MODEL_OVERRIDE = token.split("=", 1)[1]
TARGET = TARGET.replace(token, "").strip()
Pass MODEL_OVERRIDE to all Agent() calls via model=MODEL_OVERRIDE when set. Accepts symbolic names (opus, sonnet, haiku) or full IDs (claude-opus-4-6) per CC 2.1.74.
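A minimal sketch of how the override propagates into a Phase 2 spawn; the prompt text is illustrative, and Agent() stands for whichever spawn pattern agent-spawn-definitions.md selects:
spawn_kwargs = {"model": MODEL_OVERRIDE} if MODEL_OVERRIDE else {}   # omit to use the session default
Agent(prompt=f"Score correctness for: {TARGET}", **spawn_kwargs)     # illustrative prompt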
Effort detection (CC 2.1.120+)
${CLAUDE_EFFORT} is the primary signal; CC 2.1.120 sets this env var from /effort or the model picker. An --effort= token in $ARGUMENTS explicitly overrides it, and is the only path on older CC versions where the env var is unset.
EFFORT = os.environ.get("CLAUDE_EFFORT")
for token in "$ARGUMENTS".split():
if token.startswith("--effort="):
EFFORT = token.split("=", 1)[1]
TARGET = TARGET.replace(token, "").strip()
EFFORT = EFFORT or "high"
Use EFFORT to gate dimension count, agent count, and the optional xhigh uncertainty pass — see "Effort levels" table above. On CC < 2.1.120 the env var is unset; the explicit --effort= override is the only path. /ork:doctor warns when xhigh is requested without Opus 4.7.
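A sketch of that gating in the same pseudocode style; the low/medium subset and the model check are illustrative, and the authoritative dimension list lives in quality-model.md:
ALL_DIMENSIONS = ["correctness", "maintainability", "security", "performance",
                  "testability", "architecture", "documentation"]
dimensions = ALL_DIMENSIONS if EFFORT in ("high", "xhigh") else ["correctness", "maintainability", "testability"]  # illustrative subset
if EFFORT == "xhigh" and not model_is_opus_4_7:   # model_is_opus_4_7: assumed capability check
    EFFORT = "high"                               # silent fallback; /ork:doctor surfaces the warning
run_uncertainty_pass = (EFFORT == "xhigh")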
STEP -1: MCP Probe + Resume Check
Load: Read("${CLAUDE_PLUGIN_ROOT}/skills/chain-patterns/references/mcp-detection.md")
probe_memory = ToolSearch(query="select:mcp__memory__search_nodes")
Write(".claude/chain/capabilities.json", {
"memory": probe_memory.found,
"skill": "assess",
"timestamp": now()
})
state = Read(".claude/chain/state.json")
if state.skill == "assess" and state.status == "in_progress":
last_handoff = Read(f".claude/chain/{state.last_handoff}")
Phase Handoffs
| Phase | Handoff File | Contents |
|---|---|---|
| 0 | 00-intent.json | Dimensions, target, mode |
| 1 | 01-baseline.json | Initial codebase scan results |
| 2 | 02-evaluation.json | Per-dimension scores + evidence |
| 3 | 03-report.json | Final report, grade, recommendations |
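A sketch of the handoff write after a phase completes, assuming the layout above; payload fields other than those named in the table are illustrative:
Write(".claude/chain/02-evaluation.json", {
    "phase": 2,
    "scores": {"correctness": 8.0, "security": 6.5},    # illustrative values
    "evidence": {"security": ["src/api/auth.ts:42"]}
})
Write(".claude/chain/state.json", {"skill": "assess", "status": "in_progress",
                                   "last_handoff": "02-evaluation.json"})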
STEP 0: Verify User Intent with AskUserQuestion
BEFORE creating tasks, clarify assessment dimensions:
AskUserQuestion(
questions=[{
"question": "What dimensions to assess?",
"header": "Dimensions",
"options": [
{"label": "Full assessment (Recommended)", "description": "All dimensions: quality, maintainability, security, performance", "markdown": "```\nFull Assessment (7 phases)\n──────────────────────────\n Dimensions scored 0-10:\n ┌─────────────────────────────┐\n │ Correctness ████████░░ │\n │ Maintainability ██████░░░░ │\n │ Security █████████░ │\n │ Performance ███████░░░ │\n │ Testability ██████░░░░ │\n │ Architecture ████████░░ │\n │ Documentation █████░░░░░ │\n └─────────────────────────────┘\n + Pros/cons + alternatives\n + Effort estimates + report\n Agents: 4 parallel evaluators\n```"},
{"label": "Code quality only", "description": "Readability, complexity, best practices", "markdown": "```\nCode Quality Focus\n──────────────────\n Dimensions scored 0-10:\n ┌─────────────────────────────┐\n │ Correctness ████████░░ │\n │ Maintainability ██████░░░░ │\n │ Testability ██████░░░░ │\n └─────────────────────────────┘\n Skip: security, performance\n Agents: 1 code-quality-reviewer\n Output: Score + best practice gaps\n```"},
{"label": "Security focus", "description": "Vulnerabilities, attack surface, compliance", "markdown": "```\nSecurity Focus\n──────────────\n ┌──────────────────────────┐\n │ OWASP Top 10 check │\n │ Dependency CVE scan │\n │ Auth/AuthZ flow review │\n │ Data flow tracing │\n │ Secrets detection │\n └──────────────────────────┘\n Agent: security-auditor\n Output: Vuln list + severity\n + remediation steps\n```"},
{"label": "Quick score", "description": "Just give me a 0-10 score with brief notes", "markdown": "```\nQuick Score\n───────────\n Single pass, ~2 min:\n\n Read target ──▶ Score ──▶ Done\n 7.2/10\n\n Output:\n ├── Composite score (0-10)\n ├── Grade (A-F)\n ├── 3 strengths\n └── 3 improvements\n No agents, no deep analysis\n```"}
],
"multiSelect": false
}]
)
Based on the answer, adjust the workflow (sketched after this list):
- Full assessment: All 7 phases, parallel agents
- Code quality only: Skip security and performance phases
- Security focus: Prioritize security-auditor agent
- Quick score: Single pass, brief output
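A sketch of that mapping; option labels match the AskUserQuestion above, and the config keys are illustrative:
workflow_by_answer = {
    "Full assessment (Recommended)": {"phases": "all", "parallel_agents": 4},
    "Code quality only":             {"skip": ["security", "performance"], "parallel_agents": 1},
    "Security focus":                {"lead_agent": "security-auditor"},
    "Quick score":                   {"phases": "single pass", "parallel_agents": 0},
}
plan = workflow_by_answer[answer.label]   # answer.label: assumed shape of the AskUserQuestion result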
STEP 0b: Select Orchestration Mode
Load details: Read("${CLAUDE_SKILL_DIR}/references/orchestration-mode.md") for env var check logic, Agent Teams vs Task Tool comparison, and mode selection rules.
Task Management (CC 2.1.16)
TaskCreate(
subject="Assess: {target}",
description="Comprehensive evaluation with quality scores and recommendations",
activeForm="Assessing {target}"
)
TaskCreate(subject="Understand target and gather context", activeForm="Understanding target")
TaskCreate(subject="Discover scope and build file list", activeForm="Discovering scope")
TaskCreate(subject="Rate quality across 7 dimensions", activeForm="Rating quality")
TaskCreate(subject="Analyze pros and cons", activeForm="Analyzing pros/cons")
TaskCreate(subject="Compare alternatives", activeForm="Comparing alternatives")
TaskCreate(subject="Generate improvement suggestions", activeForm="Generating suggestions")
TaskCreate(subject="Compile assessment report", activeForm="Compiling report")
TaskUpdate(taskId="3", addBlockedBy=["2"])
TaskUpdate(taskId="4", addBlockedBy=["3"])
TaskUpdate(taskId="5", addBlockedBy=["4"])
TaskUpdate(taskId="6", addBlockedBy=["4"])
TaskUpdate(taskId="7", addBlockedBy=["5", "6"])
TaskUpdate(taskId="8", addBlockedBy=["7"])
TaskUpdate(taskId="2", status="in_progress")   # mark before starting each phase
TaskUpdate(taskId="2", status="completed")     # mark once the phase's output is written
What This Skill Answers
| Question | How It's Answered |
|---|---|
| "Is this good?" | Quality score 0-10 with reasoning |
| "What are the trade-offs?" | Structured pros/cons list |
| "Should we change this?" | Improvement suggestions with effort |
| "What are the alternatives?" | Comparison with scores |
| "Where should we focus?" | Prioritized recommendations |
Workflow Overview
| Phase | Activities | Output |
|---|---|---|
| 1. Target Understanding | Read code/design, identify scope | Context summary |
| 1.5. Scope Discovery | Build bounded file list | Scoped file list |
| 2. Quality Rating | 7-dimension scoring (0-10) | Scores with reasoning |
| 3. Pros/Cons Analysis | Strengths and weaknesses | Balanced evaluation |
| 4. Alternative Comparison | Score alternatives | Comparison matrix |
| 5. Improvement Suggestions | Actionable recommendations | Prioritized list |
| 6. Effort Estimation | Time and complexity estimates | Effort breakdown |
| 7. Assessment Report | Compile findings | Final report |
Phase 1: Target Understanding
Identify what's being assessed and gather context:
Read(file_path=TARGET)
Grep(pattern=TARGET, output_mode="files_with_matches")
mcp__memory__search_nodes(query=TARGET)
Phase 1.5: Scope Discovery
Load Read("${CLAUDE_SKILL_DIR}/references/scope-discovery.md") for the full file discovery, limit application (MAX 30 files), and sampling priority logic. Always include the scoped file list in every agent prompt.
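A sketch of the bounding step; the glob and the prioritize() helper are stand-ins for the discovery and sampling-priority logic in scope-discovery.md:
candidates = Glob(pattern=f"{TARGET}/**/*")          # illustrative glob
scoped_files = prioritize(candidates)[:30]           # MAX 30 files, priority order per scope-discovery.md
# scoped_files is included verbatim in every Phase 2 agent prompt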
Progressive Output (CC 2.1.76)
Output results incrementally as each evaluation phase completes:
| After Phase | Show User |
|---|---|
| 1. Target Understanding | Scope summary, file list, context |
| 1.5. Scope Discovery | Bounded file list (max 30 files) |
| 2. Quality Rating | Each dimension's score as the evaluating agent returns |
| 3. Pros/Cons | Balanced evaluation summary |
For Phase 2 parallel agents, show each dimension's score as soon as the evaluating agent returns — don't wait for all 4 agents. If any dimension scores below 4/10, flag it immediately as a priority concern requiring user attention.
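Sketched as a loop; phase2_results_as_they_arrive() and show() are illustrative stand-ins for streaming agent results and surfacing them to the user:
for result in phase2_results_as_they_arrive():            # stream per-agent results, don't batch
    show(f"{result['dimension']}: {result['score']}/10")
    if result["score"] < 4:
        show(f"Priority concern: {result['dimension']} is below 4/10")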
Phase 2: Quality Rating (7 Dimensions)
Rate each dimension 0-10 with weighted composite score. Load Read("${CLAUDE_PLUGIN_ROOT}/skills/quality-gates/references/unified-scoring-framework.md") for dimensions, weights, grade interpretation, and per-dimension criteria. Load Read("${CLAUDE_SKILL_DIR}/references/quality-model.md") for assess-specific overrides.
Load Read("${CLAUDE_SKILL_DIR}/references/agent-spawn-definitions.md") for Task Tool mode spawn patterns and Agent Teams alternative.
Composite Score: Weighted average of all 7 dimensions (see quality-model.md).
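For orientation only, the composite is a weighted average along these lines; the weights below are placeholders, and the real values come from the unified scoring framework and quality-model.md:
WEIGHTS = {"correctness": 0.25, "security": 0.20, "maintainability": 0.15, "performance": 0.10,
           "testability": 0.10, "architecture": 0.10, "documentation": 0.10}   # placeholder weights
composite = sum(scores[d] * w for d, w in WEIGHTS.items()) / sum(WEIGHTS.values())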
Phases 3-7: Analysis, Comparison & Report
Load Read("${CLAUDE_SKILL_DIR}/references/phase-templates.md") for output templates for pros/cons, alternatives, improvements, effort, and the final report.
See also: Read("${CLAUDE_SKILL_DIR}/references/alternative-analysis.md") | Read("${CLAUDE_SKILL_DIR}/references/improvement-prioritization.md")
Phase 7b: Emit Dashboard Spec (json-render)
Parse --render= from $ARGUMENTS. Default is both.
| Mode | Behavior |
|---|---|
| markdown | Current behavior: markdown assessment report only. No spec emitted. |
| json-render | Emit .claude/chain/assess-dashboard.json only. Skip markdown report. |
| both | Emit spec and markdown. Default: the human reads the report, downstream skills parse the spec. |
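Parsing mirrors the --model/--effort handling above (a sketch):
RENDER = "both"
for token in "$ARGUMENTS".split():
    if token.startswith("--render="):
        RENDER = token.split("=", 1)[1]
        TARGET = TARGET.replace(token, "").strip()
emit_spec = RENDER in ("json-render", "both")
emit_markdown = RENDER in ("markdown", "both")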
When emitting a spec:
- Load format and catalog: Read("${CLAUDE_SKILL_DIR}/references/dashboard-spec.md"). Example: references/dashboard-example.json.
- Build the spec using only catalog types: Card, StatGrid, DataTable, StatusBadge, BarMeter, Markdown. Top-level fields composite (number) and grade (string) are required for assess specs.
- One BarMeter per dimension scored. The verdict element is a StatusBadge with status success/warning/error mapped from grade (A/B → success, C → warning, D/F → error); see the sketch after this list.
- Write to .claude/chain/assess-dashboard.json with compact JSON.
- Validate before declaring success: node "${CLAUDE_SKILL_DIR}/scripts/render-spec.mjs" .claude/chain/assess-dashboard.json --check. If validation fails, fall back to markdown-only and surface the error. Never write a partial spec.
- For --render=both, render the markdown view from the spec: node "${CLAUDE_SKILL_DIR}/scripts/render-spec.mjs" .claude/chain/assess-dashboard.json. This keeps the JSON spec and the markdown report in sync.
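A sketch of the assembly; the element constructors and the top-level shape beyond composite and grade are illustrative, the exact schema is in references/dashboard-spec.md:
def grade_to_status(grade):
    return {"A": "success", "B": "success", "C": "warning"}.get(grade, "error")   # D/F map to error

elements = [BarMeter(label=d, value=scores[d], max=10) for d in dimensions]       # one per dimension
elements.append(StatusBadge(label="Verdict", status=grade_to_status(grade)))
Write(".claude/chain/assess-dashboard.json",
      {"composite": composite, "grade": grade, "elements": elements})             # compact JSON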
xhigh effort: when effort=xhigh is active, add a sibling Markdown element per dimension containing confidence and caveats from the uncertainty pass. List it in the dimensions Card's children alongside the BarMeter. See references/dashboard-spec.md for the exact pattern.
Downstream consumption: /ork:implement reads .claude/chain/assess-dashboard.json and pulls the lowest-scoring dimension and high-priority improvements (effort ≤ 2 AND impact ≥ 4) without parsing markdown tables. Measured: the assess spec is ≈830 tokens vs a ~3500-token markdown report for the same content.
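What that consumption amounts to, sketched with the thresholds above; the dimensions and improvements field names are assumptions, the real ones are defined in dashboard-spec.md:
spec = Read(".claude/chain/assess-dashboard.json")             # parsed spec
weakest = min(spec["dimensions"], key=lambda d: d["score"])    # lowest-scoring dimension (field name assumed)
quick_wins = [i for i in spec["improvements"]                  # field name assumed
              if i["effort"] <= 2 and i["impact"] >= 4]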
Self-Reported Uncertainty (Opus 4.7 only, xhigh effort)
Opus 4.7 is materially better than 4.6 at honestly reporting its own limits. When xhigh effort is active, enrich each dimension's rating with a confidence level and a list of caveats — things the model couldn't verify, assumptions it relied on, or cases it didn't test.
Output schema per dimension (JSON):
{
"dimension": "security",
"score": 7.2,
"confidence": "medium",
"caveats": [
"Didn't execute the SQL queries against a real DB to confirm parameterization",
"Assumed NODE_ENV=production in deployment; didn't verify CI config",
"Reviewed 12 of 15 handlers; remaining 3 deferred by scope filter"
],
"evidence": ["src/api/auth.ts:42", "src/middleware/guard.ts:88"]
}
Rules:
- Do not use confidence as an auto-gate. It's a signal for the human reader, not a pass/fail threshold.
- caveats must be specific. "Didn't check X" with file paths beats "uncertainty about security".
- If a caveat is cheap to resolve, resolve it instead of recording it. Caveats are for things that genuinely can't be verified within the skill's scope (e.g., production runtime behavior, future input patterns).
- Composite score still computes from score only (not weighted by confidence) to keep the number comparable across runs.
Grade Interpretation
Load Read("${CLAUDE_PLUGIN_ROOT}/skills/quality-gates/references/unified-scoring-framework.md") for grade thresholds and scoring criteria.
Key Decisions
| Decision | Choice | Rationale |
|---|---|---|
| 7 dimensions | Comprehensive coverage | All quality aspects without overwhelming |
| 0-10 scale | Industry standard | Easy to understand and compare |
| Parallel assessment | 4 agents (7 dimensions) | Fast, thorough evaluation |
| Effort/Impact scoring | 1-5 scale | Simple prioritization math |
Rules Quick Reference
| Rule | Impact | What It Covers |
|---|---|---|
| complexity-metrics (load ${CLAUDE_SKILL_DIR}/rules/complexity-metrics.md) | HIGH | 7-criterion scoring (1-5), complexity levels, thresholds |
| complexity-breakdown (load ${CLAUDE_SKILL_DIR}/rules/complexity-breakdown.md) | HIGH | Task decomposition strategies, risk assessment |
Related Skills
- ork:verify - Post-implementation verification
- ork:code-review-playbook - Code review patterns
- ork:quality-gates - Task complexity assessment, gate patterns
Version: 1.7.0 (April 2026) — ${CLAUDE_EFFORT} env var as primary effort signal (CC 2.1.120, #1540)