# harness-setup
// [Quality] Use when setting up an agent quality harness with feedforward guides and feedback sensors.
| name | harness-setup |
| version | 1.0.0 |
| description | [Quality] Use when setting up an agent quality harness with feedforward guides and feedback sensors. |
[BLOCKING] Execute skill steps in declared order. NEVER skip, reorder, or merge steps without explicit user approval.
[BLOCKING] Before each step or sub-skill call, update task tracking: set in_progress when a step starts, set completed when it ends.
[BLOCKING] Every completed/skipped step MUST include brief evidence or an explicit skip reason.
[BLOCKING] If Task tools are unavailable, create and maintain an equivalent step-by-step plan tracker with the same status transitions.
Goal: Set up the complete outer agent harness for a greenfield project so all subsequent AI coding agents operate with maximum guidance and earliest-possible quality feedback.
What this produces:
- .ai/workspace/harness/harness-inventory.md
- Feedforward guides and feedback sensor configuration (computational sensors come from /linter-setup: linters, formatters, pre-commit hooks, CI gates)

When invoked: After /scaffold and /linter-setup in the greenfield workflow. Assumes scaffolding is complete.
What it does NOT do: Install linters or configure formatters — that is /linter-setup's responsibility.
Check 1 — Linter-setup prerequisite (BLOCK if missing):
Before running any phases, verify /linter-setup has completed by checking for:
- Linter/formatter configs (e.g., .eslintrc, pyproject.toml, .editorconfig)
- Hook infrastructure (e.g., .husky/, .pre-commit-config.yaml)

If any of these are missing → AskUserQuestion: "/linter-setup appears incomplete. Computational feedback sensors must be in place before harness setup. Run /linter-setup first, then return here?"
BLOCK Phase A/B/C/D/E until linter-setup verification passes.
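The prerequisite check above can be sketched as a small script. File names mirror the examples in Check 1 and are assumptions — adjust them per stack:

```python
from pathlib import Path

# Illustrative artifact names from Check 1; real projects may use variants
# (.eslintrc.json, setup.cfg, etc.).
LINTER_CONFIGS = [".eslintrc", "pyproject.toml", ".editorconfig"]
HOOK_ARTIFACTS = [".husky", ".pre-commit-config.yaml"]

def linter_setup_complete(root: str = ".") -> bool:
    """True if at least one linter config AND one hook artifact exist."""
    root_path = Path(root)
    has_config = any((root_path / name).exists() for name in LINTER_CONFIGS)
    has_hooks = any((root_path / name).exists() for name in HOOK_ARTIFACTS)
    return has_config and has_hooks
```

If this returns False, the skill blocks and falls back to the AskUserQuestion prompt above.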
Check 2 — Existing harness inventory:
Check for .ai/workspace/harness/harness-inventory.md
If it exists → AskUserQuestion: "Harness inventory already exists — re-run to enhance existing harness, or skip?"
Do NOT treat CLAUDE.md/AGENTS.md presence as an existing harness — those are feedforward guides this skill may enhance, not signals to skip.
Phase A — Stack detection. Read from: plan.md frontmatter → architecture-design report → tech-stack-comparison report.
Extract the stack details and write the detection result to .ai/workspace/harness/stack-profile.md.
If any field undetectable → AskUserQuestion to confirm before proceeding.
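When the planning reports are absent or silent on a field, marker-file detection is one possible fallback — a minimal sketch in which the marker-to-stack mapping is illustrative, not the skill's canonical list:

```python
from pathlib import Path

# Hypothetical marker-file → stack mapping; extend per ecosystem.
STACK_MARKERS = {
    "package.json": "node",
    "pyproject.toml": "python",
    "go.mod": "go",
    "Cargo.toml": "rust",
}

def detect_stack(root: str = ".") -> list[str]:
    """Return detected stacks; an empty list means AskUserQuestion is needed."""
    root_path = Path(root)
    return [stack for marker, stack in STACK_MARKERS.items()
            if (root_path / marker).exists()]
```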
For each guide type, check if it exists; if not, create or enhance:
1. CLAUDE.md / AGENTS.md — Architecture conventions
   - Patterns chosen in /architecture-design (e.g., Clean Architecture, CQRS, Repository)
2. Skill activation rules
   - When to fire /review-domain-entities, /code-review
3. Architecture notes
   - docs/architecture/ with:
     - bounded-contexts.md — domain boundaries and ownership
     - dependency-rules.md — allowed import directions between layers
     - naming-conventions.md — project-specific naming for files, classes, functions
4. Pattern catalog
   - docs/architecture/pattern-catalog.md — patterns from /architecture-design with DO/DON'T examples

Present the list of guides created/updated via AskUserQuestion: "Feedforward guides above will be created/enhanced. Confirm or adjust?"
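A pattern-catalog entry might take a shape like the following (the Repository example and the method name are hypothetical, shown only to illustrate the DO/DON'T format):

```markdown
### Repository pattern

DO: expose intent-revealing query methods (e.g., `find_overdue_orders()`).
DON'T: leak ORM query objects or raw SQL through the repository interface.
```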
Confirm /linter-setup has completed:
- Linter/formatter configs present (e.g., .eslintrc, pyproject.toml, .editorconfig)
- Hook infrastructure present (e.g., .husky/, .pre-commit-config.yaml)

If any are missing → invoke /linter-setup before continuing.
Output: confirmation that computational sensors are in place, with file paths listed.
Configure which AI review skills fire at each lifecycle stage. Present to user via AskUserQuestion:
"Which inferential sensors should be mandatory vs optional for this project?"
Pre-implementation (planning gate):
- /why-review — validate design rationale before committing to an implementation approach

Pre-commit (lightweight review):
- /code-review before committing significant changes

Post-implementation (domain model changes):
- /review-domain-entities — when domain entity files are in the changeset

Pre-release (mandatory gates):
- /sre-review — reliability and operational readiness
- /security — security review before production release

Recurring drift detection:
- /scan-codebase-health — schedule quarterly (or on a CI schedule) to detect drift

Add the agreed sensor configuration to CLAUDE.md under "## Review Gates".
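The resulting CLAUDE.md section might look like this — the gate assignments below are illustrative; record whatever the user actually approved:

```markdown
## Review Gates

| Stage               | Sensor                  | Mandatory? |
| ------------------- | ----------------------- | ---------- |
| Pre-implementation  | /why-review             | optional   |
| Pre-commit          | /code-review            | mandatory  |
| Post-implementation | /review-domain-entities | mandatory  |
| Pre-release         | /sre-review, /security  | mandatory  |
| Recurring           | /scan-codebase-health   | quarterly  |
```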
Define the project's behaviour harness plan:
Functional spec format:
- AskUserQuestion: "Feature documentation format?" Options: feature-docs (17-section), TDD specs only, lightweight ADRs
- Specs live in docs/business-features/ or an equivalent spec home

Test strategy pyramid:

Approved fixtures pattern:

Coverage threshold:
- AskUserQuestion: "Minimum test coverage threshold for CI gate?" (Recommended: 80% line coverage)

Document the agreed test strategy to docs/architecture/test-strategy.md.
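For a Python stack using coverage.py, the agreed threshold could be enforced roughly like this in pyproject.toml — a sketch only; 80% is the recommended default above, not a fixed value:

```toml
[tool.coverage.report]
# Fail the CI gate when line coverage drops below the agreed threshold.
fail_under = 80
```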
Write .ai/workspace/harness/harness-inventory.md:
# Harness Inventory
Generated: {date}
Stack: {detected stack from Phase A}
## Feedforward Guides
| Type | File/Skill | Purpose |
| ------------- | ------------------------------------ | ------------------------------- |
| Inferential | CLAUDE.md §Architecture Patterns | Shapes AI architectural choices |
| Inferential | CLAUDE.md §Anti-Patterns | Prevents known bad patterns |
| Inferential | docs/architecture/pattern-catalog.md | DO/DON'T examples per pattern |
| Computational | .editorconfig | Cross-IDE consistency |
## Feedback Sensors — Computational
| Stage | Tool/Hook | What it catches |
| ---------- | ----------------- | ------------------------------- |
| Pre-commit | {linter} | Style violations, common errors |
| Pre-commit | {formatter} | Code formatting drift |
| CI | {type-checker} | Type errors |
| CI | {static-analyzer} | Security, complexity, dead code |
## Feedback Sensors — Inferential
| Stage | Skill/Agent | What it catches |
| ------------------- | ----------------------- | ------------------------------ |
| Pre-implementation | /why-review | Design rationale gaps |
| Pre-commit | /code-review | Convention drift, logic errors |
| Post-implementation | /review-domain-entities | Domain model quality |
| Pre-release | /sre-review | Operational readiness |
| Pre-release | /security | Security vulnerabilities |
## Open Gaps
| Area | Reason | Risk |
| ------------------------ | -------- | -------------- |
| {area not yet harnessed} | {reason} | {LOW/MED/HIGH} |
Present inventory to user for review via AskUserQuestion.
[IMPORTANT] Use TaskCreate to break ALL work into small tasks BEFORE starting — including tasks for each file read. This prevents context loss from long files. For simple tasks, the AI MUST ask the user whether to skip.
Critical Thinking Mindset — Apply critical thinking and sequential thinking. Every claim needs traced proof; act only at >80% confidence. Anti-hallucination: never present a guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, and stay skeptical of your own confidence — certainty without evidence is the root of all hallucination.
AI Mistake Prevention — Failure modes to avoid on every task:
- Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal.
- Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing.
- Trace the full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain.
- Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just the happy path.
- When debugging, ask "whose responsibility?" before fixing. Trace whether the bug is in the caller (wrong data) or the callee (wrong handling). Fix at the responsible layer — never patch the symptom site.
- Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code.
- Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks.
- Holistic-first debugging — resist the nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis.
- Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly.
- Surface ambiguity before coding — don't pick silently. If a request has multiple interpretations, present each with an effort estimate and ask. Never assume the all-records, file-based, or more complex path.
Harness Engineering — An outer agent harness has two jobs: raise first-attempt quality + provide self-correction feedback loops before human review.
Controls split:
| Axis        | Type          | Examples                                                                       | Frequency        |
| ----------- | ------------- | ------------------------------------------------------------------------------ | ---------------- |
| Feedforward | Computational | .editorconfig, strict compiler flags, enforced module boundaries               | Always-on        |
| Feedforward | Inferential   | CLAUDE.md conventions, skill prompts, architecture notes, pattern catalogs     | Always-on        |
| Feedback    | Computational | Linters, type checks, pre-commit hooks, ArchUnit/arch-fitness tests, CI gates  | Pre-commit → CI  |
| Feedback    | Inferential   | /code-review skill, /sre-review, /security, LLM-as-judge passes                | Post-commit → CI |

Three harness types:
- Maintainability — Complexity, duplication, coverage, style. Easiest: rich deterministic tooling.
- Architecture fitness — Module boundaries, dependency direction, performance budgets, observability conventions.
- Behaviour — Functional correctness. Hardest: requires approved fixtures or strong spec-first discipline.
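An architecture-fitness sensor can be as small as a test that scans imports against the dependency rules. A minimal Python sketch, assuming layer directories named domain/ and infrastructure/ (both names are hypothetical — adapt to the project's dependency-rules.md):

```python
import ast
from pathlib import Path

# Illustrative rule: domain code must never import infrastructure code.
FORBIDDEN = {"domain": {"infrastructure"}}

def boundary_violations(src_root: str) -> list[str]:
    """Scan Python files in each guarded layer; report forbidden imports."""
    violations = []
    root = Path(src_root)
    for layer, banned in FORBIDDEN.items():
        for py_file in (root / layer).rglob("*.py"):
            tree = ast.parse(py_file.read_text())
            for node in ast.walk(tree):
                names = []
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names = [node.module]
                for name in names:
                    if name.split(".")[0] in banned:
                        violations.append(f"{py_file}: imports {name}")
    return violations
```

Wired into pytest or a pre-commit hook, an empty result means the boundary holds; any entry fails the gate.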
Keep quality left: pre-commit sensors fire first (cheap), CI sensors fire second, post-review last (expensive).
Research-driven: Never hardcode tool choices. Detect tech stack → research ecosystem → present top 2-3 options → user decides. Enforce strictest defaults; loosen only with explicit approval.
Harnessability signals: Strong typing, explicit module boundaries, opinionated frameworks = easier to harness. Treat these as greenfield architectural choices, not just style preferences.
IMPORTANT MUST ATTENTION follow declared step order for this skill; NEVER skip, reorder, or merge steps without explicit user approval
IMPORTANT MUST ATTENTION for every step/sub-skill call: set in_progress before execution, set completed after execution
IMPORTANT MUST ATTENTION every skipped step MUST include explicit reason; every completed step MUST include concise evidence
IMPORTANT MUST ATTENTION if Task tools unavailable, maintain an equivalent step-by-step plan tracker with synchronized statuses
MUST ATTENTION never auto-decide feedforward guide content — present draft and confirm with AskUserQuestion
MUST ATTENTION verify /linter-setup completed before Phase C passes
MUST ATTENTION write harness-inventory.md incrementally (append after each phase) — never hold in memory
MUST ATTENTION harness is a living document — update inventory when new sensors are added later
[TASK-PLANNING] Before acting, analyze task scope and break it into small todo tasks using TaskCreate.