Run any Skill in Manus with one click

execute-review-findings

Stars1

Forks0

UpdatedApril 18, 2026 at 23:14

Use when you have code review findings, PR comments, or review reports that need to be systematically addressed — especially when there are multiple findings across different files and severities

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

ahrav

ahrav/Gossip-rs

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Software Quality Assurance Analysts and TestersComputer and Mathematical Occupations·SOC 15-1253

SKILL.md

readonly

Execute Review Findings

Systematically convert review findings into tracked tasks and execute them in priority waves with parallel agents where files don't overlap.

Core principle: Normalize findings → create self-contained beads tasks → analyze concurrency → execute in priority waves with parallel dispatch.

When to Use

After /review-dispatch produces a ranked findings report
After receiving multiple PR review comments
When handed an ad-hoc review document (security audit, perf report, etc.)
When a review produced 3+ findings that span multiple files

When NOT to Use

Single finding — just fix it directly
Findings are all in the same function — no parallelism benefit
You haven't read the code yet — understand first, then act

Invocation

/execute-review-findings <source>

Source options:

pr or pr:<number> — fetch PR comments via gh api
review — use the most recent /review-dispatch output in conversation
<file-path> — read a markdown review document from disk
(no argument) — prompt user to paste or describe findings

Phase 1: Gather & Normalize

Parse findings from any source into a uniform list. Each normalized finding has:

Field	Description
`id`	Sequential number (F1, F2, ...)
`severity`	MUST FIX / SHOULD FIX / CONSIDER / NIT
`type`	bug, performance, safety, documentation, design, complexity
`file`	File path(s) affected
`line`	Line number(s) or range
`summary`	One-line description
`detail`	Full finding with current → desired behavior
`specialists`	Which reviewers flagged it (if from `/review-dispatch`)

Parsing rules by source:

/review-dispatch output: Parse the ranked tables directly. Map importance 9-10 → MUST FIX, 7-8 → SHOULD FIX, 5-6 → CONSIDER, 1-4 → NIT.
PR comments: Fetch via gh api repos/{owner}/{repo}/pulls/{number}/comments. Categorize each: bug claim → bug type, style suggestion → design/complexity, question → skip (reply only).
Markdown document: Look for severity markers, tables, or heading-based grouping. Map to standard severities.

Severity Normalization

Reviewers use different severity vocabularies. Normalize all external labels to the four canonical levels before filtering:

External Label	Canonical Severity
CRITICAL, P0, Blocker, MUST FIX	MUST FIX
HIGH, P1, Major, SHOULD FIX	SHOULD FIX
MEDIUM, P2, Moderate, CONSIDER	CONSIDER
LOW, P3, Minor, NIT, Trivial, Style, INFO	NIT

When a finding lacks an explicit severity label, infer from type:

Bug → default SHOULD FIX (unless clearly cosmetic)
Safety / Security → default MUST FIX
Performance → default SHOULD FIX
Documentation / Design / Complexity → default CONSIDER

Present the normalized table to the user for confirmation before proceeding.

Severity Filter (Default: MUST FIX + SHOULD FIX + CONSIDER)

After normalization, discard NIT-severity findings by default. Only MUST FIX, SHOULD FIX, and CONSIDER proceed to Phase 2.

In reviewer terms: MUST FIX and HIGH are always addressed. MEDIUM is addressed (maps to CONSIDER). LOW, Minor, NIT, Trivial, Style, and INFO are skipped — these are cosmetic or informational and dilute agent focus.

Discarded findings are listed in the summary report (Phase 6) with status "Skipped (below threshold)".

Override: Use --include=nit to opt in to NIT-severity findings when the user explicitly wants them addressed.

Phase 2: Create Beads Tasks

Create one bd create per finding. Each task description must be fully self-contained — a fresh agent can work it without reading the original review.

Task Description Template (All Types)

## Finding: {summary}

**Severity**: {severity}
**File(s)**: {file}:{line}
**Type**: {type}
**Flagged by**: {specialists}

### Problem
{detail — full finding including current behavior and why it matters}

### Resolution Steps
{type-specific steps — see below}

### Acceptance Criteria
- [ ] {specific, verifiable conditions}
- [ ] All existing tests pass: `cargo test`
- [ ] Code compiles clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Type-Specific Resolution Steps

Bug (TDD mandatory + test-consolidate):

1. Analyze existing tests using /test-consolidate principles BEFORE writing any test:
   a. Read the test module for the affected file
   b. Identify existing test clusters that cover the same function/method
   c. Determine the right test form for the new test:
      - If existing tests for this function use rstest: ADD A NEW #[case] to the
        existing parameterized test rather than writing a standalone #[test]
      - If existing tests use proptest: check if the bug reveals a property violation
        — if so, tighten the property assertion rather than adding a separate test
      - If existing tests use table-driven: add the bug's input/expected to the table
      - If no existing tests or tests are standalone: write a new test, but prefer
        rstest #[case] format if 3+ similar tests already exist for the same function
   d. NEVER duplicate test structure that already exists — extend, don't clone
2. Write the failing test in the form determined by step 1
   - Test name: descriptive of the behavior, NOT the review
   - Place in the appropriate test module for the file
3. Run test, confirm it FAILS: `cargo test <test_name> -- --nocapture`
4. Fix the production code — minimal change to pass the test
5. Run full suite: `cargo test`
6. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Performance:

1. Establish baseline: run relevant benchmark or add one if none exists
   - Use `/bench-compare` if Criterion benchmarks cover this path
2. Implement the optimization
3. Re-benchmark and compare against baseline
4. Verify no regressions: `cargo test`
5. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Safety (unsafe code):

1. Write a test exercising the unsafe path with edge-case inputs
2. Fix the safety issue (bounds checks, invariant enforcement, etc.)
3. Add or update `// SAFETY:` comment documenting invariants
4. Run tests: `cargo test`
5. If Miri-compatible: `cargo +nightly miri test <test_name>`
6. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Documentation:

1. Read the code the docs describe — understand actual behavior
2. Write or update documentation to match reality
3. Check AGENTS.md consistency table — if touched file is listed, update corresponding docs
4. Verify doc tests compile: `cargo test --doc`
5. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Design / Complexity (apply /reduce-complexity methodology):

1. Classify the target function(s) using the /reduce-complexity framework:
   a. Measure LOC (opening `{` to closing `}`, excluding blanks/comment-only lines),
      max nesting depth (with Rust match discount: exhaustive match on ≤6 variants is +0),
      and parameter count.
   b. Classify as ESSENTIAL (domain-inherent), ACCIDENTAL (reducible), or MIXED.
      Use the judgment tests: domain necessity, distinct error recovery per branch,
      sequential coupling, and accidental indicators (duplicated blocks, flattenable
      nesting, always-together params).
   c. If ESSENTIAL: STOP and report back. The finding may be misguided — complexity
      is inherent to the problem (sequential state machines, distinct error handling
      per NFS error class, etc.). Do NOT refactor just to satisfy a metric.
2. For ACCIDENTAL/MIXED, select AT MOST 3 techniques from the reduce-complexity
   catalog, preferring higher-confidence techniques first:
   - `auto-apply` (high confidence): guard clauses, redundant else removal, remove
     unnecessary Result, type aliases for repeated complex types
   - `suggest` (medium): pass by reference, `?` operator, merge match arms, let-else
   - `flag-for-review` (low): extract function, collapse if-chains, polymorphism,
     decompose state machine
3. Apply reduce-complexity safety checks BEFORE editing:
   - Unsafe exclusion zone: if the function contains `unsafe` or calls unsafe helpers
     in the same module, SKIP extract-function suggestions. Unsafe invariants
     (e.g. `set_len` + `truncate`, `#[repr(align)]`, FFI contracts) often span
     multiple statements and are invisible to refactoring.
   - Over-abstraction brake (for extract-function): if `param_count + return_fields
     >= body_lines / 3`, the extraction interface is nearly as complex as the body —
     flag for user review rather than extract.
   - Async boundary: extracting code containing `.await` may introduce `Send` bound
     violations on captured non-Send types — verify with `cargo check` immediately
     after extraction.
   - Clippy annotation respect: if the function has `#[allow(clippy::...)]`, the
     developer made a deliberate decision. Lower confidence by one level and report
     alongside the fix.
4. Refactor to address the finding while preserving behavior. Do NOT exceed 3
   techniques — more than that signals the function needs broader redesign, which
   should be filed as a separate task.
5. Run full test suite to confirm no regressions: `cargo test`
6. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`

Priority Mapping for `bd create`

Severity	bd priority
MUST FIX	1
SHOULD FIX	2
CONSIDER	3
NIT	4

Example

bd create --title="Fix off-by-one in window boundary check" --type=bug --priority=1

Then update the description with the full self-contained template using bd update <id> --description="...".

Phase 3: Concurrency Analysis

Determine which tasks can run in parallel vs. must run sequentially.

Step 1: Build File-Touch Map

For each task, list every file it will read or write:

Task	Writes	Reads
F1	crates/scanner-engine/src/engine/core.rs	crates/scanner-engine/src/engine/mod.rs
F2	crates/scanner-engine/src/engine/scratch.rs	-
F3	crates/scanner-engine/src/engine/core.rs	crates/scanner-engine/src/api.rs

Step 2: Identify Conflicts

Two tasks conflict if they write to the same file. Read-read and read-write of different files are fine.

Step 3: Form Parallel Groups

Within each severity wave, group non-conflicting tasks:

Wave 1 (MUST FIX):
  Group A (parallel): F1, F4  — no file overlap
  Group B (sequential after A): F3  — conflicts with F1 on core.rs

Wave 2 (SHOULD FIX):
  Group C (parallel): F5, F6, F7  — no file overlap

Step 4: Register Dependencies

bd dep add <F3-id> <F1-id>   # F3 depends on F1 (same file)

Phase 4: Execute in Priority Waves

Execute findings wave by wave: MUST FIX → SHOULD FIX → CONSIDER. NIT and INFO findings are skipped by default (see Phase 1 severity filter).

Within Each Wave

Dispatch parallel agents for each non-conflicting group using the Task tool with subagent_type=general-purpose. Each agent gets the full self-contained task description from Phase 2.

Agent prompt structure:

You are fixing a code review finding. Follow the resolution steps exactly.

{full task description from Phase 2}

IMPORTANT:
- Follow the resolution steps in order
- For bugs: BEFORE writing any test, read the existing test module and apply
  /test-consolidate principles:
  * If the function already has rstest parameterized tests, add a #[case] — do NOT
    create a new standalone test function
  * If the function has proptest coverage, tighten the property or add a targeted
    prop_assert — do NOT duplicate with a unit test
  * If similar tests exist as a table-driven loop, add your case to the table
  * Only create a new standalone #[test] when no consolidation opportunity exists
  * NEVER create a test that duplicates coverage already provided by existing tests
- For bugs: write the failing test BEFORE fixing code
- Run all verification commands listed in acceptance criteria
- Report back: what you changed, test results, any issues encountered

Collect results from all agents in the group.
Sequential groups: After a parallel group completes, dispatch the next group that depended on it.
Close completed tasks: bd close <id> for each successfully resolved finding.

Quality Gate Between Waves

Before moving to the next wave, run:

cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings
cargo test

If anything fails, fix it before proceeding. Do not let failures from Wave 1 propagate into Wave 2.

Handling `--plan-only` Mode

If invoked with --plan-only, stop after Phase 3. Present the task list, dependency graph, and execution plan without dispatching agents. The user can then:

Reorder or remove tasks
Adjust groupings
Execute manually or re-invoke without --plan-only

Phase 5: Doc Verification

After all waves pass the quality gate, dispatch a separate /doc-verify agent on every source file modified during execution. This catches documentation drift introduced by the fixes.

Step 1: Collect Modified Files

Gather the list of all .rs files that were modified across all waves. Use git diff --name-only (unstaged) to identify them. Filter to .rs files only — skip test-only files and Cargo.toml.

Step 2: Dispatch Doc-Verify Agent

Launch a fresh Task agent with subagent_type="general-purpose". The agent prompt must:

Read every modified source file in full
Read adjacent module files for cross-reference context
Follow the /doc-verify Phase 2 verification protocol:
- Extract all testable claims from doc comments
- Verify each claim against the actual (now-modified) code
- Classify findings as BLOCK / WARN / INFO
Produce the standard doc-verify report

You are a documentation verifier. You have zero prior context about why these files
were changed — verify only what the documentation says against what the code does.

Files to verify:
{list of modified .rs files}

Follow the /doc-verify Phase 2 protocol exactly:
- Step A: Extract all testable claims from doc comments
- Step B: Verify code-level claims against actual implementation
- Step C: Identify external claims
- Step D: Verify external claims (skip with --code-only if time-constrained)
- Step E: Produce findings report

Output the standard doc-verify report format with BLOCK/WARN/INFO findings.

Step 3: Handle Findings

No BLOCKs: Proceed to summary report.
BLOCKs found: Present findings to the user. Doc BLOCKs indicate the fixes introduced documentation inaccuracies (e.g., a doc comment now describes old behavior). These should be fixed before considering the review execution complete.
- If the BLOCK is a trivial doc update (stale count, renamed parameter), fix it inline.
- If the BLOCK requires judgment (rewriting a behavioral description), flag it for the user.

When to Skip

If --skip-doc-verify flag is set, skip this phase entirely.
If no modified files contain doc comments (verified via quick Grep for doc-comment prefixes), skip with a note in the summary.

Phase 6: Complexity Regression Check

After Phase 5 doc-verification, run a targeted /reduce-complexity pass on every source file modified during execution. Fixes — especially defensive ones for bug and safety findings — commonly add guard code, error handling branches, and edge case paths. Left unchecked, these accumulate into accidental complexity that later reviewers will flag again.

This phase catches complexity regressions introduced by the fixes, NOT pre-existing complexity in the files. Only new or significantly worsened hotspots matter here.

Step 1: Build Before/After Comparison

For each modified .rs file, compute the per-function complexity metrics (LOC, max nesting, parameter count) for:

Before: the file at the merge base or the commit prior to execution
After: the file after all waves completed (current working tree)

Functions whose metrics crossed a threshold boundary (e.g. Advisory → Moderate, Moderate → High) or whose LOC grew by ≥30% are flagged as regressions.

Use git to fetch the before state:

git show HEAD:<file-path>  # or git show <merge-base>:<file-path>

Skip this step entirely if no .rs files under src/ were modified (e.g., only test files, configs, or docs changed).

Step 2: Dispatch Reduce-Complexity Agent

Launch a Task agent with subagent_type="general-purpose". The agent prompt:

You are checking for complexity regressions introduced by code review fixes.

Follow the /reduce-complexity methodology Phases 1-4 exactly:
  Phase 1: Detection — measure LOC, nesting, params for every function in the
           modified files (after state)
  Phase 2: Classification — ESSENTIAL / ACCIDENTAL / MIXED for each flagged function
  Phase 3: Suggestions — AT MOST 3 techniques per function
  Phase 4: Safety checks — unsafe exclusion zone, over-abstraction brake,
           async boundary warnings, clippy annotation respect

Scope: ONLY report complexity regressions — functions that newly crossed a threshold
OR grew by ≥30% LOC. Use git to compare against HEAD or the merge base:

  git show <before-ref>:<file-path>   # before state for comparison

Pre-existing complexity in these files is OUT OF SCOPE for this check. Do not flag
hotspots that were already present before the fixes.

Files to analyze (after-state in working tree):
{list of modified .rs files}

For each regression, report:
- Function name and location (file:line)
- Before/after metrics (LOC, nesting, params)
- Classification (ESSENTIAL / ACCIDENTAL / MIXED)
- Which finding introduced the complexity (cross-reference Phase 2 task IDs)
- Top technique to address it, with over-abstraction and safety check results

Output a concise ranked table sorted by severity (newly-Critical > newly-High >
newly-Moderate).

Step 3: Handle Regressions

No regressions: Note in summary and proceed.
Accidental regressions found: Present findings to the user. The fixes introduced reducible complexity — simple ones (guard clauses, redundant else, let-else on new happy paths added for bug fixes) can be applied inline as a follow-up pass. Broader regressions (extract function, state machine decomposition) should become new beads tasks for a subsequent session rather than bolted onto this execution.
Essential regressions only: Report them in the summary but do not act. Essential complexity introduced by fixes is usually correct — a bug fix that adds a distinct error recovery branch is a legitimate complexity increase.

When to Skip

If --skip-complexity-check flag is set.
If no .rs files under src/ were modified.
If the execution was entirely documentation findings (Phase 2 type=documentation for all tasks).

Phase 7: Summary Report

After all waves complete, present:

## Review Findings Execution Summary

**Source**: {source description}
**Total findings**: N
**Executed**: X resolved, Y skipped, Z failed

### Results

| # | Finding | Severity | Status | Task ID | Notes |
|---|---------|----------|--------|---------|-------|
| F1 | Off-by-one in window check | MUST FIX | Resolved | beads-xxx | TDD: test added + fix |
| F2 | Missing capacity hint | SHOULD FIX | Resolved | beads-yyy | 12% alloc reduction |
| F3 | Unclear doc comment | CONSIDER | Resolved | beads-zzz | Updated doc |
| F4 | Rename variable | NIT | Skipped | - | User opted out |

### Verification

- All tests passing: yes/no
- Clippy clean: yes/no
- New tests added: N
- Tests consolidated (extended existing rstest/proptest): N
- Files modified: [list]

### Doc Verification (Phase 5)

- Files verified: N
- Claims checked: N
- BLOCKs: N (list if any)
- WARNs: N
- Verdict: PASS / PASS WITH WARNINGS / FAIL

### Complexity Regression Check (Phase 6)

- Files scanned: N
- Regressions found: N (newly-Critical / newly-High / newly-Moderate)
- Classification breakdown: X ESSENTIAL / Y ACCIDENTAL / Z MIXED
- Applied inline: N (guard clauses, redundant else, let-else)
- Filed as new beads tasks: N (extract function, state machine decomposition)
- Verdict: PASS / PASS WITH REGRESSIONS (list top 3)

Anti-Patterns

Anti-Pattern	Why It's Wrong	Do This Instead
Fixing a bug without a failing test first	You don't know if the fix works or if the bug was real	TDD: failing test → fix → green
Putting multiple findings in one task	Agents lose focus, partial completion is messy	One `bd create` per finding
Skipping documentation findings	Doc debt compounds silently	Documentation findings are never optional
Dispatching agents that write to the same file	Merge conflicts and lost work	Concurrency analysis in Phase 3
Task description says "see review for details"	Fresh agent can't work it — context is lost	Self-contained descriptions with full context
Running all severities in parallel	A MUST FIX might invalidate a NIT	Execute in priority waves
Skipping the quality gate between waves	Broken state cascades into subsequent fixes	`cargo test` + clippy between every wave
Writing a new standalone test when rstest cases exist	Test proliferation, maintenance burden, inconsistent patterns	Add a `#[case]` to the existing rstest instead
Ignoring existing proptest coverage for a bug	Duplicates coverage, misses the property violation	Tighten the property assertion or add a targeted `prop_assert`
Skipping doc-verify after fixes	Fixes can invalidate doc comments, creating silent drift	Always run Phase 5 doc-verify on modified files
Refactoring an ESSENTIAL complexity finding to satisfy a metric	Breaks incident-response narrative; fabricates "simpler" code by hiding real complexity	Classify first; reject the finding if complexity is inherent
Applying >3 reduce-complexity techniques to a single function in one pass	Signals broader redesign is needed — incremental fixes will not land it	File a new beads task for redesign; apply at most 3 techniques here
Extracting code containing `.await` without checking `Send` bounds	`cargo check` fails mid-wave; dependent fixes stall	Run `cargo check` immediately after any async extraction
Extracting a function out of an `unsafe`-invariant span	Silently breaks invariants; Miri will not catch all cases	Unsafe exclusion zone — skip extract-function entirely for these
Skipping Phase 6 complexity regression check on fix-heavy waves	Defensive fixes accumulate into nested guard code that later reviewers re-flag	Always run Phase 6 when `.rs` files under `src/` were modified

Configuration

Flag	Effect
`--plan-only`	Stop after Phase 3 — show tasks and execution plan, don't execute
`--wave=N`	Execute only wave N (1=MUST FIX, 2=SHOULD FIX, 3=CONSIDER)
`--include=nit`	Include NIT findings (skipped by default)
`--include=nit,info`	Include both NIT and INFO findings
`--skip=consider`	Also skip CONSIDER findings (only MUST FIX and SHOULD FIX)
`--dry-run`	Parse and normalize findings without creating beads tasks
`--skip-doc-verify`	Skip Phase 5 doc verification
`--skip-complexity-check`	Skip Phase 6 complexity regression check

Related Skills

/review-dispatch — produces the findings this skill consumes
/pr-comment-response — TDD verify-first pattern for individual PR comments
/bench-compare — baseline/comparison benchmarks for performance findings
/test-strategy — choose appropriate test type (unit, property, fuzz, Kani)
/test-consolidate — referenced in bug resolution steps to avoid test duplication
/doc-verify — runs as Phase 5 to catch documentation drift from fixes
/reduce-complexity — drives the Design/Complexity resolution steps (Phase 2) and the Phase 6 regression check; use it to classify ESSENTIAL vs ACCIDENTAL complexity and to apply safety-annotated reduction techniques

name	execute-review-findings
description	Use when you have code review findings, PR comments, or review reports that need to be systematically addressed — especially when there are multiple findings across different files and severities

execute-review-findings

More from this repository

More from this repository

Execute Review Findings

When to Use

When NOT to Use

Invocation

Phase 1: Gather & Normalize

Severity Normalization

Severity Filter (Default: MUST FIX + SHOULD FIX + CONSIDER)

Phase 2: Create Beads Tasks

Task Description Template (All Types)

Type-Specific Resolution Steps

Priority Mapping for bd create

Example

Phase 3: Concurrency Analysis

Step 1: Build File-Touch Map

Step 2: Identify Conflicts

Step 3: Form Parallel Groups

Step 4: Register Dependencies

Phase 4: Execute in Priority Waves

Within Each Wave

Quality Gate Between Waves

Handling --plan-only Mode

Phase 5: Doc Verification

Step 1: Collect Modified Files

Step 2: Dispatch Doc-Verify Agent

Step 3: Handle Findings

When to Skip

Phase 6: Complexity Regression Check

Step 1: Build Before/After Comparison

Step 2: Dispatch Reduce-Complexity Agent

Step 3: Handle Regressions

When to Skip

Phase 7: Summary Report

Anti-Patterns

Configuration

Related Skills

Execute Review Findings

When to Use

When NOT to Use

Invocation

Phase 1: Gather & Normalize

Severity Normalization

Severity Filter (Default: MUST FIX + SHOULD FIX + CONSIDER)

Phase 2: Create Beads Tasks

Task Description Template (All Types)

Type-Specific Resolution Steps

Priority Mapping for bd create

Example

Phase 3: Concurrency Analysis

Step 1: Build File-Touch Map

Step 2: Identify Conflicts

Step 3: Form Parallel Groups

Step 4: Register Dependencies

Phase 4: Execute in Priority Waves

Within Each Wave

Quality Gate Between Waves

Handling --plan-only Mode

Phase 5: Doc Verification

Step 1: Collect Modified Files

Step 2: Dispatch Doc-Verify Agent

Step 3: Handle Findings

When to Skip

Phase 6: Complexity Regression Check

Step 1: Build Before/After Comparison

Step 2: Dispatch Reduce-Complexity Agent

Step 3: Handle Regressions

When to Skip

Phase 7: Summary Report

Anti-Patterns

Configuration

Related Skills

Priority Mapping for `bd create`

Handling `--plan-only` Mode

Priority Mapping for `bd create`

Handling `--plan-only` Mode