dkreview
// Run an adversarial implementation review using tooling, semantic analysis, dependency tracing, and acceptance criteria verification.
| name | dkreview |
| description | Run an adversarial implementation review using tooling, semantic analysis, dependency tracing, and acceptance criteria verification. |
Agentic implementation review that combines deterministic tooling with semantic analysis, dependency tracing, and acceptance criteria verification.
/dkreview is the user-facing review command. When invoked directly by a user
without --single-pass or --no-loop, immediately invoke the Skill tool with
skill: dkreviewloop, then stop. This gives users the 3-clean-pass review loop
without requiring them to remember a separate command.
Run the single-pass review instructions below only when one of these is true:
- --single-pass or --no-loop was passed
- the review was invoked by /dkreviewloop as an internal loop iteration

If you are unsure whether this is a direct user invocation or an internal loop
iteration, prefer the loop and invoke dkreviewloop.
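The routing rule above can be sketched as a small shell function. This is only an illustration: dk_review_mode and the DK_INVOKED_BY_LOOP marker are hypothetical names, not part of the skill, and how the loop actually identifies its own iterations is an assumption.

```bash
# Hypothetical helper: decide between the single-pass review and dkreviewloop.
# Arguments: the flags passed to /dkreview.
# DK_INVOKED_BY_LOOP=1 is an assumed marker set by the loop for its iterations.
dk_review_mode() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --single-pass|--no-loop)
        echo "single-pass"
        return 0
        ;;
    esac
  done
  if [ "${DK_INVOKED_BY_LOOP:-0}" = "1" ]; then
    echo "single-pass"
    return 0
  fi
  # Direct user invocation, or unsure: prefer the loop.
  echo "loop"
}
```

With no flags and no marker the function prints loop, which matches the "prefer the loop" default.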
Four phases, executed sequentially. Codebase Context is a sub-step of Phase 0; it runs FIRST so all subsequent passes have the context they need.
Run machine checks first. Don't waste semantic review effort on issues linters catch.
Before scope analysis, gather the context you need to judge findings. Skipping this is the primary source of false positives: what looks like a bug in isolation is often an established pattern when read in context.
Read in this order, and stop early when you have enough to judge the change:
1. AGENTS.md, plus CLAUDE.md compatibility pointers (root and any nested): language boundaries, naming, error-handling, architecture rules
2. .doyaken/rules/*.md files referenced from AGENTS.md / CLAUDE.md
3. .doyaken/doyaken.md § Reviewers and any project-specific review-criteria sections: some projects extend or override the defaults
4. git log --oneline --since=3.months -- <file> for each deep-review file. Recent fix: commits = fragile area, apply extra scrutiny.
5. Grep for existing instances. If pattern X is established (3+ occurrences) and the change introduces Y instead, that's a finding. If the change uses an "unusual" pattern that turns out to match an established convention, the finding is filtered.
6. prompts/failure-recovery.md and any .debt ledger files: debt items already accepted should not be re-raised

When you flag a finding, the report MUST cite which Phase 0 artefact you checked (e.g., "AGENTS.md says all hooks must use set -euo pipefail; this hook does not"). Findings without a Phase 0 anchor are downgraded.
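As a rough sketch of the "established pattern (3+ occurrences)" check, the helper below counts fixed-string matches under a directory. The function name is hypothetical, and counting matching lines with a literal grep is an assumption; a real review may need a regex or an AST-level search.

```bash
# Hypothetical helper: succeed when a literal pattern appears on 3 or more
# lines under a directory, i.e. when it counts as "established".
is_established_pattern() {
  local pattern="$1" dir="$2" count
  # grep -rF prints one line per match; awk reports how many lines it saw.
  count=$(grep -rF -- "$pattern" "$dir" 2>/dev/null | awk 'END { print NR }')
  [ "$count" -ge 3 ]
}
```

For example, `is_established_pattern 'set -euo pipefail' hooks/` would succeed only if at least three lines under hooks/ contain that string (the hooks/ path is illustrative).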
Detect the default branch using the shared library function (see lib/git.sh):
source "${DOYAKEN_DIR:-$HOME/work/doyaken}/lib/common.sh"
DEFAULT_BRANCH=$(dk_default_branch)
git diff origin/$DEFAULT_BRANCH...HEAD --stat
git diff origin/$DEFAULT_BRANCH...HEAD --name-only
git log origin/$DEFAULT_BRANCH..HEAD --oneline
Classify every changed file by its role in the project (e.g., business logic, test, config, migration, documentation, generated code). Use the project's directory structure and naming conventions to determine categories.
Count files and lines changed per category. This drives review depth in Phase 1.
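One way to picture the classification and per-category count is the sketch below. The path conventions and category names are assumptions for illustration; the real categories should come from the project's own directory structure. Both function names are hypothetical.

```bash
# Hypothetical classifier: map a changed file path to a coarse category.
# The glob conventions here are assumptions; adapt them to the project.
classify_file() {
  case "$1" in
    *_test.*|*.test.*|*.spec.*|test/*|tests/*|*/test/*|*/tests/*) echo "test" ;;
    *migrations/*)                          echo "migration" ;;
    *.md|docs/*|*/docs/*)                   echo "documentation" ;;
    *.json|*.yaml|*.yml|*.toml|.gitignore)  echo "config" ;;
    *.gen.*|*generated/*)                   echo "generated" ;;
    *)                                      echo "business-logic" ;;
  esac
}

# Count files per category from a newline-separated path list on stdin.
count_by_category() {
  local path
  while IFS= read -r path; do
    if [ -n "$path" ]; then
      classify_file "$path"
    fi
  done | sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

It could be fed from the name-only diff, for example `git diff "origin/$DEFAULT_BRANCH...HEAD" --name-only | count_by_category`.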
If the change is trivial, skip the full semantic review (Phases 1-2) and go straight to Phase 0.2 deterministic checks only:
Skip semantic review when ALL of these are true:
- No changed file matches a security-sensitive pattern: *auth*, *permission*, *policy*, *secret*, *token*, *credential*, *session*, *middleware*, *guard*, *rls*, *acl*, *.env*; any migration file; any file under directories like security/, access/, iam/

Examples that skip: README typo fix, package version bump, config value change, .gitignore update.
Examples that DON'T skip: New endpoint (even if small), migration, test changes, any file matching the security-sensitive patterns above, any new middleware or policy file.
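The security-sensitive filename patterns can be expressed as a shell case statement. This is a sketch under assumptions: is_security_sensitive is a hypothetical name, matching is made case-insensitive by lowercasing the path (the skill text does not say either way), and migration files are detected by "migration" appearing in the path.

```bash
# Hypothetical helper: succeed when a path matches a security-sensitive pattern.
is_security_sensitive() {
  local path
  # Assumption: compare case-insensitively by lowercasing the path first.
  path=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$path" in
    *auth*|*permission*|*policy*|*secret*|*token*|*credential*) return 0 ;;
    *session*|*middleware*|*guard*|*rls*|*acl*|*.env*)          return 0 ;;
    security/*|*/security/*|access/*|*/access/*|iam/*|*/iam/*)  return 0 ;;
    *migration*) return 0 ;;  # assumption: migrations carry "migration" in the path
  esac
  return 1
}
```

The globs are deliberately broad (for example, *rls* also matches urls.py), so the check errs toward running the semantic review.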
Discover the project's quality tools (same approach as /dkverify) and run checks only for packages with changed files (not the entire repo).
For each tool discovered (formatter, linter, type checker, test runner), run it scoped to the affected area. Record failures as findings.
If the project uses code generation (detectable from Makefile targets, package.json scripts, or generator config files), and any source files that feed the generator changed, run the generator and check for uncommitted changes. Stage any changes and record as a finding.
For each changed file, assign review depth and applicable passes. This is internal; don't present it to the user.
Deep (full-file read + dependency trace + git history):
Shallow (diff-only scan):
Select which review passes apply per file. If the plan includes task risk levels (from /dkplan), use them to adjust review depth:
Risk-proportional pass selection (when task risk metadata is available):
| Risk | Passes | Notes |
|---|---|---|
| LOW | A, F, H | Correctness, Style, Acceptance Criteria only |
| MEDIUM | A, B, C, E, F, G, H, J | Skip D (Performance), I (Documentation), K (Observability), L (Backward Compatibility) unless they apply to the touched files |
| HIGH | All 12 passes | Full dependency trace + git history context |
If no risk metadata is available, fall back to the file-based assignment below.
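The risk table can be read as a simple lookup. The sketch below uses a hypothetical function name and returns a non-zero status for unknown or missing risk levels so the caller can fall back to file-based assignment.

```bash
# Hypothetical lookup: print the review passes for a task risk level.
passes_for_risk() {
  case "$1" in
    LOW)    echo "A F H" ;;
    MEDIUM) echo "A B C E F G H J" ;;
    HIGH)   echo "A B C D E F G H I J K L" ;;
    *)      return 1 ;;  # no usable risk metadata: use file-based assignment
  esac
}
```

The MEDIUM row lists only the default passes; D, I, K and L still apply when the touched files call for them.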
File-based pass assignment (default):
| Pass | Applies when |
|---|---|
| A: Correctness & Logic | Business logic, services, data access, anything with branching or async |
| B: Design & Architecture | New files, refactored code, complex changes, new boundaries |
| C: Security | Auth, access control, external APIs, data access, file uploads, deserialization |
| D: Performance | Database queries, list endpoints, loops, external calls, hot paths |
| E: Testing | Test files, plus any production code with new branches |
| F: Style & Conventions | All files |
| G: Dependency Consistency | Types, schemas, API contracts, migrations, anything imported from elsewhere |
| H: Acceptance Criteria | All production code (if ticket context available) |
| I: Documentation Quality | All files with non-obvious logic; new public APIs |
| J: Holistic Consistency | All changed files (cross-file pass) |
| K: Observability | Production code paths (not docs/tests); new error paths or background jobs |
| L: Backward Compatibility | Public APIs, CLI, DB schema, event/wire formats, config formats, defaults |
Read the review criteria from prompts/review.md for the 12-pass criteria (A-L), the Phase 0 codebase-context preamble, the Observe-Verify-Conclude protocol, and confidence scoring guidelines. Project-specific extensions live in AGENTS.md, CLAUDE.md, and .doyaken/doyaken.md.
Before reviewing individual files, scan ALL changed files at the diff level to build a mental model of the full change:
- … (Grep for similar features)

Record cross-file assumptions; these will be verified in Phase 2.2-2.3 and 2.8.
For every changed file, Read the entire file โ not just the diff. Evaluate the change within its full context. The diff tells you what changed; the full file tells you whether the change fits.
For each file under deep review, record observations before forming conclusions:
This sequence exists because code review is susceptible to motivated reasoning: forming a conclusion ("this is a bug") and then constructing the justification backward. Observations first, conclusions second.
For each changed function, class, type, or export, trace consumers up to 3 hops deep:
1. Grep for imports referencing the changed file; verify each direct consumer is consistent
2. For each direct consumer, Grep for ITS consumers; verify consistency
3. Track the chain: [Changed] X → [Hop 1] Y ✓ → [Hop 2] Z ✓
Answer: "Are all consumers of this changed API updated consistently, including transitive consumers?"
If a function signature changed but callers weren't updated → finding. If a type added a field but serializers don't map it → finding. If a type narrowed but consumers still pass the old shape → finding. If a transitive consumer depends on the old behavior → finding.
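A two-hop trace can be approximated with plain grep. This is a crude sketch: find_consumers and trace_two_hops are hypothetical names, and matching a module by its literal name is an assumption that will produce false positives in real codebases; each hit still needs to be read and verified.

```bash
# Hypothetical helper: list files under a directory that mention a module name.
find_consumers() {
  local module="$1" dir="$2"
  grep -rlF -- "$module" "$dir" 2>/dev/null | sort
}

# Walk two hops: direct consumers of the changed module, then their consumers.
trace_two_hops() {
  local changed="$1" dir="$2" hop1 name
  find_consumers "$changed" "$dir" | while IFS= read -r hop1; do
    echo "[Hop 1] $hop1"
    # Derive the consumer's module name: file name without its extension.
    name=$(basename "$hop1")
    name="${name%.*}"
    find_consumers "$name" "$dir" | grep -vxF -- "$hop1" | sed 's/^/[Hop 2] /'
  done
}
```

A call such as `trace_two_hops "./foo" src` prints one line per consumer, tagged with its hop, which maps onto the chain notation above (the src path is illustrative).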
When files that define contracts (types, schemas, API definitions, database models) change, verify that all dependent code is consistent. Use Grep to find imports of the changed file, then verify each consumer handles the change.
For deep-review files only:
git log --oneline -10 -- <file>
If recent fix: commits exist → flag as fragile area. Apply extra scrutiny on correctness (Pass A).
If ticket context is available (from /dkplan or /doyaken; check task list or plan file):
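Spotting fix: commits in the history output could look like the sketch below. It assumes commit subjects follow a "fix:" or "fix(scope):" prefix convention; count_fix_commits is a hypothetical name.

```bash
# Hypothetical helper: count commits whose subject starts with "fix:" or
# "fix(scope):" in `git log --oneline` output read from stdin.
count_fix_commits() {
  # Each line is "<abbreviated hash> <subject>".
  # grep -c exits non-zero on zero matches, so keep the status clean.
  grep -cE '^[0-9a-f]+ fix(\([^)]*\))?:' || true
}
```

Typical use would be `git log --oneline -10 -- "$file" | count_fix_commits`; any non-zero count marks the file as a fragile area.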
For each acceptance criterion:
Flag unmet criteria as severity: high findings.
Semantic layer on top of linting that catches what machines miss:
Execute Pass I from prompts/review.md on all files with non-obvious logic:
Execute Pass K from prompts/review.md on production code paths:
If the project has no observability tooling, downgrade these findings to suggestions.
Execute Pass L from prompts/review.md if the change touches a public contract (HTTP API, CLI, library API, DB schema, event/wire format, config format):
Execute Pass J from prompts/review.md across ALL changed files as a set:
Maximum 3 iterations. After each fix round:
On 2nd iteration with recurring findings: Read prompts/failure-recovery.md and run the failure analysis on each finding that appeared in the previous iteration. Log your recovery decision before proceeding. If findings are accepted as debt, record them in the debt ledger and remove them from the active findings list.
After 3 iterations, report remaining findings and proceed to /dkverify.
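The iteration cap can be sketched as a bounded loop. Everything here is illustrative: review_fix_loop and demo_pass are hypothetical, the callback is assumed to print the number of open findings after a review-and-fix round, and stopping early on zero findings is an assumption rather than something the steps above state.

```bash
# Hypothetical driver: run at most 3 review-and-fix rounds.
# $1 is the name of a callback that prints the number of open findings.
review_fix_loop() {
  local pass_cmd="$1" max_iterations=3 iteration=0 findings=0
  while [ "$iteration" -lt "$max_iterations" ]; do
    iteration=$((iteration + 1))
    findings=$("$pass_cmd" "$iteration")
    if [ "$findings" -eq 0 ]; then
      break  # assumption: a clean round ends the loop early
    fi
  done
  echo "Iterations: $iteration/$max_iterations, remaining findings: $findings"
}

# Demo callback: two findings after round 1, none afterwards.
demo_pass() {
  case "$1" in
    1) echo 2 ;;
    *) echo 0 ;;
  esac
}
```

Calling `review_fix_loop demo_pass` stops after the second round; a callback that never reaches zero stops at the third.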
Print a structured report at the end:
## Self-Review Report
### Scope
- Files changed: X (by category breakdown)
- Review depth: X deep, Y shallow
### Deterministic Checks
- Format: PASS | Lint: PASS | Typecheck: PASS | Tests: PASS
### Findings (confidence >= 50 only)
| # | Severity | Confidence | File:Line | Pass | Issue | Fix Applied |
|---|----------|------------|-----------|------|-------|-------------|
### Filtered Out
- X finding(s) below confidence threshold (< 50)
### Dependency Trace
- [Changed] foo.ts → [Checked] bar.ts ✓, baz.ts ✓
### Acceptance Criteria
- [x] Criterion 1: implemented + tested
- [ ] Criterion 2: NOT FOUND
### Iterations: N/3
### Result: PASS | PASS WITH WARNINGS | NEEDS ATTENTION
Result meanings:
- … /dkverify but flag to the user.

… /dkverify. Self-review fixes issues; verify confirms they're fixed. The duplication is intentional.
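The confidence threshold used in the report (findings at 50 or above are listed, the rest are only counted) can be sketched with awk. The pipe-separated input format and the filter_findings name are assumptions made for this example.

```bash
# Hypothetical filter: keep findings with confidence >= 50 and summarize the rest.
# Input lines are assumed to be "confidence|severity|file:line|issue".
filter_findings() {
  awk -F'|' -v min=50 '
    $1 + 0 >= min { print; next }
    { dropped++ }
    END { printf "Filtered out: %d finding(s) below confidence threshold (< %d)\n", dropped, min }
  '
}
```

The kept lines can feed the Findings table, and the final summary line corresponds to the Filtered Out section.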