| name | craft |
| type | skill |
| category | meta |
| description | Instruction quality gate — reviews agent instructions (task bodies, workflow steps, skill procedures, self-test protocols) for shallow-execution vulnerabilities before deployment. Two modes: author (pre-hoc review) and audit (trace a failure back to the instruction gap). The bar is excellence, not compliance. |
| triggers | ["craft","review these instructions","instruction quality","are these instructions good enough","raise the bar","why did the agent miss this"] |
| modifies_files | true |
| needs_task | false |
| mode | conversational |
| domain | ["meta","framework","quality-assurance"] |
| owner | pauli |
| allowed-tools | Read, Grep, Glob, Bash, Edit, Write, Agent |
| model | opus |
| version | 0.1.0 |
| permalink | skills-craft |
Instruction Craftsmanship Guidelines
Review and audit agent-facing instructions (tasks, workflows, self-tests) to eliminate shallow-execution vulnerabilities.
Modes of Operation
- Author Mode: Review proposed instructions before deployment to catch execution gaps.
- Audit Mode: Analyze execution transcripts after a failure to trace it back to instruction gaps.
Quality Criteria: The Defect Classes
Ensure instructions are free of the following defects:
- Compliance Framing: Avoid instructions whose success criterion is "did X run?". Require outcome-based verification ("is the output correct, complete, and verified?").
- Missing Artifact Chain: Ensure all output channels (stdout, stderr, log files, JSONL transcripts, schema validations) are checked, not just the primary summary channel.
- No Adversarial Checks: Explicitly check for silent failures (e.g., a zero exit code on empty or corrupt output, or a config override that downgrades a blocking check to a warning).
- Summary-as-Evidence: Prohibit using agent summaries or claims as proof of success. Require direct inspection of actual artifacts.
- Undefined Boundary Behavior: Explicitly define fallback search spaces or escalation procedures when standard searches return no results.
- Skimped Verification: Require reading the complete output or file rather than relying on grep, keyword matching, or tail scans.
- No Negative Verification: Check for the absence of unexpected content (corruption, leaked credentials, placeholders) in addition to the presence of expected results.
- Deferred-Read Dispersion: A rule the agent needs at the moment of action lives in a second file it must go read: a "see X.md", a [[link]] to "canonical" doctrine, a "read first" pointer. An agent that already has the instructions in hand frequently will not make the follow-up read, so a load-bearing rule behind a pointer is a rule that often won't run. Keep the operative instruction inline, where it executes; reserve pointers for genuinely optional depth, never for a step that is required every time. Shorter, co-located instructions beat longer, more distributed ones in almost all cases: when content is mandatory, fold it in and tighten rather than forking it into a referenced file (and never duplicate it across both the summary and the linked file, which is the worst of both). Audit-mode tell: the executing instruction was a pointer, and the missed rule lived one read away.
These are common patterns, not an exhaustive list. If instructions feel shallow but match no named defect, trust the feeling, say so, and articulate why — and remember depth is verification specificity, not step count.
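As a rough illustration of what several of these defect classes look like once they are turned into checks, here is a minimal Python sketch. The function name `verify_run`, the channel names, and the regex patterns are all hypothetical and chosen only for illustration; real instructions would name the actual artifacts and the actual forbidden content.

```python
import re

# Hypothetical patterns for illustration only; tune them to the real artifacts.
FORBIDDEN_PATTERNS = {
    "placeholder": re.compile(r"\b(TODO|TBD|lorem ipsum)\b", re.IGNORECASE),
    "credential": re.compile(
        r"(api[_-]?key|password|secret)\s*[:=]\s*\S+", re.IGNORECASE
    ),
}


def verify_run(exit_code: int, channels: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed.

    `channels` maps a channel name (e.g. "stdout", "stderr", a file name)
    to the full text captured from that channel.
    """
    problems = []
    if exit_code != 0:
        problems.append(f"non-zero exit code: {exit_code}")

    for name, text in channels.items():
        # Adversarial check: a zero exit code does not prove the output exists.
        # An empty stderr is normal, so it is exempt from this check.
        if name != "stderr" and not text.strip():
            problems.append(f"{name} is empty")
        # Negative verification: assert the absence of content that must not appear.
        for label, pattern in FORBIDDEN_PATTERNS.items():
            if pattern.search(text):
                problems.append(f"{name} contains {label}")

    # Artifact chain: inspect secondary channels, not only the primary summary.
    if re.search(r"\b(error|warning)\b", channels.get("stderr", ""), re.IGNORECASE):
        problems.append("stderr reports an error or warning")
    return problems
```

A run that exits cleanly but writes an empty report is still flagged: `verify_run(0, {"stdout": "done", "stderr": "", "report.json": ""})` returns `["report.json is empty"]`. Pattern checks like these supplement a full read of each artifact and do not replace it.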
Workflow
Author Mode Workflow
- Assess the target instructions against the defect classes.
- Quote any text exhibiting a defect and write a high-depth rewrite.
- Output a verdict: SHIP (no defects), REVISE (defects found, edit file in-place with fixes), or REJECT (fundamental redesign needed).
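One possible way to picture the author-mode output is as a list of findings plus a verdict derived from them. The sketch below is an assumption about shape, not a prescribed format: `Finding`, `Verdict`, `decide_verdict`, and the `needs_redesign` flag are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SHIP = "SHIP"      # no defects found
    REVISE = "REVISE"  # defects found; fix the file in place
    REJECT = "REJECT"  # fundamental redesign needed


@dataclass
class Finding:
    defect_class: str  # e.g. "Summary-as-Evidence"
    quote: str         # the exact instruction text exhibiting the defect
    rewrite: str       # the proposed high-depth replacement


def decide_verdict(findings: list, needs_redesign: bool = False) -> Verdict:
    """Map review results to a verdict.

    `needs_redesign` is a reviewer judgment (hypothetical flag): it signals
    that patching individual lines would not rescue the instructions.
    """
    if needs_redesign:
        return Verdict.REJECT
    return Verdict.REVISE if findings else Verdict.SHIP
```

Keeping the quote and the rewrite together in each finding means every REVISE verdict carries the evidence and the fix, rather than a bare claim that something was wrong.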
Audit Mode Workflow
- Identify what the agent missed and locate the executing instruction.
- Classify the instruction gap under the defect classes.
- Edit the instruction in-place with a rewrite to prevent the failure.
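When classifying a gap in audit mode, a quick scan of the executing instruction for pointers can help surface Deferred-Read Dispersion. The following is a minimal sketch under stated assumptions: `find_pointers` and the three patterns are hypothetical, and they only approximate the pointer forms named in the defect list ("see X.md", a [[link]], "read first").

```python
import re

# Hypothetical patterns approximating the pointer forms named in the defect list.
POINTER_PATTERNS = [
    re.compile(r"\bsee\s+\S+\.md\b", re.IGNORECASE),  # "see X.md"
    re.compile(r"\[\[[^\]]+\]\]"),                    # [[wiki-style link]]
    re.compile(r"\bread\s+first\b", re.IGNORECASE),   # "read first" pointer
]


def find_pointers(instruction: str) -> list[str]:
    """Return every pointer-like phrase found in the instruction text."""
    hits = []
    for pattern in POINTER_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(instruction))
    return hits
```

For example, `find_pointers("Run the tests. See docs/testing.md for the rules.")` returns `["See docs/testing.md"]`. A hit is only a lead: the auditor still has to confirm that the missed rule actually lived behind that pointer.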
Output Expectations
- Respond with structured, direct reviews or audits. Keep lists and verdicts concise, and cite the exact lines changed wherever a revision is made.