Name: Resource Leak Audit (resource-leak-audit)
Author: NethermindEth
Repository: NethermindEth/nethermind

Description: Audit C#/.NET code for resource leaks, IDisposable misuse, CTS lifecycle bugs, unbounded growth, handle exhaustion, and dispose race conditions. Use when asked to "audit for leaks", "check for resource leaks", "find dispose issues", "review for memory leaks", or when reviewing a PR that touches resource management, disposal, CancellationTokenSource, ArrayPool, event subscriptions, or async coordination patterns.

Updated: March 27, 2026 at 13:25

Resource Leak Audit

Overview

Systematic audit for resource leaks in C#/.NET code. Covers IDisposable/IAsyncDisposable misuse, unbounded collection growth, pinned memory, unmanaged allocations, event handler GC roots, abandoned tasks, closure captures, static accumulation, finalizer queue pressure, thread-local storage, TCS lifecycle, dispose races, and constructor exception leaks.

Mode Selection

Determine the audit scope before starting:

PR Mode (default when reviewing a PR)

Audit only files changed in the PR. Use git diff to get the changed files, then apply the full methodology to those files plus their immediate callers/callees.

Steps:

1. Get changed files: `git diff origin/master...HEAD --name-only` (three-dot diff uses the merge-base, so it works correctly both locally and in CI even if master has advanced)
2. Filter to non-test C# files (exclude `*.Test*`, `*.Benchmark*`)
3. For each changed file, also read classes it inherits from and interfaces it implements
4. Apply Phase 1 search only for categories relevant to the changed code
5. Apply Phase 2 validation for any CRITICAL/HIGH findings
6. Skip Step 0a (git history mining) unless a finding needs prior-fix context
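
Steps 1 and 2 can be sketched as a small shell filter. This is a minimal sketch, not part of the skill itself: the function names `filter_audit_files` and `changed_audit_files` are hypothetical, and it assumes test and benchmark code can be recognized by `.Test` or `.Benchmark` appearing anywhere in the path.

```shell
# Reads file paths on stdin and keeps C# files that are not in test or
# benchmark projects. Assumption: ".Test" / ".Benchmark" anywhere in the
# path marks excluded code, mirroring the *.Test* and *.Benchmark* globs.
filter_audit_files() {
  # "|| true" keeps the exit status at 0 when nothing survives the filter.
  grep -E '\.cs$' | grep -Ev '\.(Test|Benchmark)' || true
}

# Hypothetical wrapper: files changed since the merge-base with master,
# narrowed to the audit scope. Must be run inside the repository.
changed_audit_files() {
  git diff origin/master...HEAD --name-only | filter_audit_files
}
```

Running `changed_audit_files` from the repository root would then print one auditable path per line. Base classes and implemented interfaces (step 3) still have to be added by reading the code.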

Full Audit Mode (when asked to audit the codebase)

Audit all non-test code in src/Nethermind/. This is the exhaustive mode — use all steps below including Step 0.

Before Starting

Read these — they define conventions and inform what counts as a leak:

- All rule files in `.agents/rules/` (always list the directory rather than relying on a fixed list)
- `CONTRIBUTING.md` and `.editorconfig`

Methodology — Two-Phase Audit with Reviewer Gate

Step 0: Study Prior Fixes and Codebase Patterns (Full Audit Mode)

0a. Mine prior leak-fix commits (the `-i` flag makes `--grep` case-insensitive, so commit messages mentioning "Dispose" and "dispose" both match):

git log -i --oneline --grep="dispose" --grep="leak" --all-match -30
git log -i --oneline --grep="memory leak" -20
git log -i --oneline --grep="IDisposable" -20
git log -i --oneline --grep="ArrayPool" -20
git log -i --oneline --grep="buffer leak" -10
git log -i --oneline --grep="double dispose" -10
git log -i --oneline --grep="race condition" --grep="dispose" --all-match -10

For the top 20 most relevant commits, read the diff (git show <hash>). For each: What was the leak pattern? What was the fix? Are there similar patterns elsewhere still unfixed? Do sibling classes have the same bug?

0b. Discover the codebase's resource management conventions:

Before searching for leaks, learn what safe patterns exist. Leaks often come from NOT using an available safe wrapper. For each resource type, answer:

- Pool buffer management: Disposable wrappers around `ArrayPool.Rent`/`Return`? Search: struct/class types in `**/Buffers/*`, `**/Collections/*` implementing `IDisposable` referencing `ArrayPool`.
- Thread-safe disposal: What pattern for `_disposed` guards? Search: `Interlocked` near `_disposed`. Then find `bool _disposed` fields not following it.
- CTS lifecycle: Helper combining `Cancel()` + `Dispose()` atomically? Search: methods taking `ref CancellationTokenSource` calling both.
- Buffer ownership: Marker interfaces for owned resources (`IOwned*`, `IMemoryOwner`)? Disposal extension methods?
- Test infrastructure for leak detection: Tracking/counting pool wrappers in test projects?

Record findings — used in Phase 1 backward searches.
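
The thread-safe disposal question above can be approximated with a coarse file-level scan. A minimal sketch, under the assumption that a file declaring `bool _disposed` without mentioning `Interlocked` anywhere is worth reading; `unguarded_disposed_files` is a hypothetical helper name, and a hit is only a candidate, not a finding.

```shell
# Prints C# files under a root directory (default: current directory) that
# declare a plain `bool _disposed` field but never reference Interlocked.
# Assumption: this is a file-level heuristic; a class may be guarded by a
# lock or may only ever be disposed from a single thread.
unguarded_disposed_files() {
  root="${1:-.}"
  find "$root" -type f -name '*.cs' | sort | while IFS= read -r file; do
    if grep -Eq 'bool[[:space:]]+_disposed' "$file" && ! grep -q 'Interlocked' "$file"; then
      printf '%s\n' "$file"
    fi
  done
}
```

For example, `unguarded_disposed_files src/Nethermind` would list candidates for the backward search in Phase 1.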

Parallelization

If subagents are available:

- Step 0: Run 0a and 0b in parallel
- Phase 1: One subagent PER category. Each exhausts ONE category completely
- Convergence checkpoint: After first wave, review findings for cross-cutting patterns, launch second wave
- Phase 2: Validation subagents for CRITICAL/HIGH candidates
- Final review: Separate reviewer subagent with complete findings for fresh perspective

Phase 1: Exhaustive Search (Breadth-First)

The most common failure mode is spending all time validating a few findings while missing dozens of others. Phase 1 separates search from validation.

For each category (see @references/pattern-categories.md):

1. Search exhaustively using three complementary strategies:
   - Forward search (construction -> disposal): Grep for resource construction. For each hit, follow forward to verify cleanup exists.
   - Backward search (safe pattern bypass): Using Step 0b wrappers, search for code using the raw pattern instead of the available safe alternative. Every safe pattern implies a class of bugs from not using it.
   - Disposal-site audit (Dispose -> completeness): For each IDisposable class in the subsystem, read its Dispose method and compare against constructor and field declarations.

   In all three, search for constructors, factory methods, and assignments, not just type names.
2. Check EVERY match, not a sample. Report "N total matches, M confirmed findings, (N-M) verified clean". If 100+, split across subagents.
3. For every candidate, read surrounding code (method, class's Dispose, callers if disposable is returned):
   - Is there a using statement/declaration?
   - Is there a try/finally with cleanup?
   - If it's a field, does the owning class clean it up on shutdown/disposal?
   - If ownership transfers, does the receiver clean it up?
   - On error/exception paths, is the resource still cleaned up?
4. Impact assessment (mandatory for every candidate before recording). Before assigning severity, answer:
   - What accumulates or degrades? Name the concrete resource: bytes, kernel handles, native allocations, file descriptors, pool slots, registrations. Read what the cleanup method actually releases. If it releases nothing for this specific instance, state that.
   - How much, how fast? Quantify: bytes per occurrence, occurrences per hour/day. If the answer is "once at shutdown" or "zero bytes," severity cannot be above COSMETIC.
   - Is this path actually hit? Check input domain bounds (e.g., size thresholds that gate allocation paths), config defaults, whether the code is reachable from any caller.
   - Does the runtime already handle this? Check DI container disposal, GC finalization, eviction/TTL mechanisms, bounded caches. If another mechanism already cleans this up, state which one and classify accordingly.
5. Record findings: file, line(s), one-sentence description, category, severity, frequency, impact assessment answers, ownership note, error-path note. Do NOT trace call graphs or search GitHub yet. Move fast.
6. Mandatory sibling expansion. After each confirmed finding: identify the structural pattern, search for every instance. Run `git log --oneline -10 -- <file>` to check for sibling fixes.
7. After exhausting a category, reflect: "What patterns suggest similar leaks I haven't searched for?"
8. Stop ONLY when all categories are covered AND reflection produces no new actionable patterns.
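
The forward search and the coverage report for one category can be sketched as follows. This is an illustrative sketch only: `count_matches` and `coverage_line` are hypothetical helpers, the pattern is supplied by the auditor, and the count is of matching lines, so two constructions on one line count once.

```shell
# Counts lines in C# files under a root (default: current directory) that
# match an extended regular expression, e.g. a construction pattern.
count_matches() {
  pattern=$1
  root=${2:-.}
  n=$(find "$root" -type f -name '*.cs' -exec grep -Eh -e "$pattern" {} + | wc -l | tr -d ' ')
  printf '%d\n' "$n"
}

# Formats the coverage report: total matches, confirmed findings, and the
# remainder that was read and verified clean.
coverage_line() {
  total=$1
  confirmed=$2
  if [ "$confirmed" -gt "$total" ]; then
    echo "confirmed findings cannot exceed total matches" >&2
    return 1
  fi
  printf '%d total matches, %d confirmed findings, %d verified clean\n' \
    "$total" "$confirmed" "$((total - confirmed))"
}
```

A call such as `coverage_line "$(count_matches 'new CancellationTokenSource\(' src/Nethermind)" 3` would produce the report line once three findings have been confirmed by reading the code. The confirmed count always comes from reading the code, never from the grep itself.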

Phase 1 Convergence Checkpoint

After all categories return:

- What patterns repeat across findings?
- Do findings in one subsystem imply identical bugs in siblings?
- Do any interfaces appear as a root cause?

The convergence step is where the best findings come from — a leak in class A often implies the same leak in B, C, D.

Phase 2: Validation

CRITICAL/HIGH findings — full deep validation is mandatory.

For each CRITICAL/HIGH candidate:

A. Triggerability proof: Trace complete call graph to entry points. For race conditions: prove two threads can reach the race point — name the threads, show how launched, identify exact interleaving. If caller is serial, downgrade. For error-path leaks: identify what exception triggers it. For resource exhaustion: calculate accumulation rate.

B. Adversary analysis: Can an external attacker trigger this remotely? What control needed? Amplification? Mark ADVERSARY-TRIGGERABLE if triggerable without node operator access.

C. Protocol context: What Ethereum/L2 protocol concept? Why does the code exist? Real-world impact — not "memory pressure" but specifically what happens.

D. Existing work check: gh search issues --repo <owner/repo> "<keyword>" and gh search prs --repo <owner/repo> "<keyword>". Check git log --oneline -10 -- <file>.

E. Triggerability classification: TESTABLE or CONFIDENT-LEAK. Add CORRUPTION-RISK / ADVERSARY-TRIGGERABLE as applicable.

F. Test strategy (TESTABLE only): Failing test -> passing test after fix. What injection needed? What assertion proves the leak?
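
The accumulation-rate calculation in step A is plain arithmetic and can be sketched as below. The helper name `hours_until_exhaustion` and the idea of a fixed byte budget are assumptions made for illustration; the inputs must come from the code under audit, not from guesses.

```shell
# Given bytes leaked per occurrence, occurrences per hour, and a byte
# budget, prints the whole number of hours until the budget is consumed.
# Prints "never" when nothing accumulates (the COSMETIC case).
hours_until_exhaustion() {
  bytes_per_leak=$1
  leaks_per_hour=$2
  budget_bytes=$3
  per_hour=$((bytes_per_leak * leaks_per_hour))
  if [ "$per_hour" -le 0 ]; then
    echo "never"
    return 0
  fi
  echo $((budget_bytes / per_hour))
}
```

For instance, `hours_until_exhaustion 4096 300 1073741824` (4 KiB per leak, 300 leaks per hour, 1 GiB budget) prints `873`, roughly 36 days of uptime.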

MEDIUM/LOW findings — present Phase 1 findings with their impact assessments to the user and ask:

"Phase 1 produced N MEDIUM and M LOW findings with impact assessments but without deep validation (call graph tracing, existing-work checks, triggerability proofs). Some may be false positives. Would you like me to run deep validation on MEDIUM/LOW findings to filter those out? This will take additional time."

If the user declines, output MEDIUM/LOW findings clearly marked: "Phase 1 only — impact assessed but not deeply validated."

Self-Critique (Every Finding)

Before recording any finding at any severity, argue against yourself:

- "Does anything actually accumulate or degrade?" Read what the cleanup method releases for this specific instance. If the answer is "nothing" or "only a managed object GC already handles," it's COSMETIC.
- "Is the caller actually concurrent?" Prove it, don't assume from type signature.
- "Is the trigger actually reachable?" Check input domain bounds, size thresholds, config defaults. Dead code is not a leak.
- "Does the leak actually accumulate?" Is there eviction, TTL, GC, DI container disposal, or a finite bound?
- "Am I confusing poor style with a real bug?" If the quantified runtime impact is zero, classify as COSMETIC. State what best practice it violates and why the impact is zero. The user decides whether to fix cosmetic issues.

Final Review Pass

- Severity calibration: consistent across all severities, not just CRITICAL/HIGH
- Duplicate detection: merge same root cause from different angles
- False positive sweep: if trigger can't be stated in one concrete sentence, downgrade
- Missing context: every CRITICAL/HIGH needs all mandatory fields

Output Format

Phase 1 (all findings)

### Finding [N]: [Short title]
- **File**: `Nethermind.Project/Path/File.cs`
- **Line(s)**: [line numbers]
- **What leaks**: [precise description]
- **Category**: [from pattern list]
- **Severity**: CRITICAL | HIGH | MEDIUM | LOW | COSMETIC
- **Frequency**: [per-peer, per-block, per-request, per-shutdown, once, etc.]
- **Impact**: [quantified — bytes/handles per occurrence, accumulation rate, or "zero — [reason]"]
- **Ownership analysis**: [Who creates, who should clean up, why missing]
- **Error-path analysis**: [If applicable]
- **Fix complexity**: [SIMPLE (1-5 lines) | MEDIUM (10-30 lines) | HARD (refactor)]

COSMETIC means: technically violates a best practice but has zero quantified runtime impact. Include what practice it violates and why the impact is zero.

Phase 2 additions (CRITICAL/HIGH — append)

- **Protocol context**: [What protocol concept, why code exists]
- **Exact trigger**: [Specific event causing the leak]
- **Call graph**: [Entry point -> ... -> buggy method]
- **Adversary analysis**: [Can attacker trigger? Control needed? Amplification?]
- **Accumulation rate**: [Leaks/hour, bytes/object, when degradation occurs]
- **Real-world impact**: [Specific degradation on running node]
- **Trigger**: TESTABLE | CONFIDENT-LEAK [| CORRUPTION-RISK] [| ADVERSARY-TRIGGERABLE]
- **Trigger explanation**: [How test triggers / Why not deterministic]
- **GitHub context**: [Existing issues/PRs]
- **Prior PR context**: [Similar fixes with commit hash]
- **Test complexity**: [SIMPLE | MEDIUM | HARD]
- **Test strategy**: [Failing test description]

Final Output

- Complete deduplicated findings list
- Triage summary by reachability: Practically triggerable / Config-gated / Theoretically possible / Impossible in current architecture
- Test plan summary: one line per TESTABLE finding
- Potentially missed: suspected patterns needing more investigation
- Confidence assessment: coverage estimate per category
- COSMETIC findings: separate section, with one sentence each explaining what practice is violated and why runtime impact is zero

Rules

- Non-test code only: skip `*.Test*`, `*.Benchmark*`
- Read actual code: don't report based on grep matches alone
- Prove triggerability: "could theoretically race" is not enough
- Config-gated is NOT dead: report with config dependency noted
- Check GitHub before reporting: prevent duplicate work
- Ownership transfers are not leaks: verify the receiver cleans up
- GC finalization is not proper disposal: still report, but quantify the actual impact (finalizer queue pressure vs. zero impact)
- `using` declarations (C# 8+) are valid disposal
- Process-lifetime singletons: Check DI registration. If both the object and its event publishers are singletons with matching lifetimes, event unsubscription and disposal have zero runtime effect. Classify as COSMETIC and state why.
- `Cancel()` is NOT `Dispose()`, but the impact depends on what the CTS holds internally. After finding a CTS that is cancelled but not disposed, check: was it created with a timeout constructor? Was `CancelAfter()` called? Was its token passed to `Task.Delay()` or `CreateLinkedTokenSource()`? These create internal Timer handles or parent-token registrations that only `Dispose()` releases. A plain `new CancellationTokenSource()` that was cancelled with none of the above holds no unmanaged resources. Classify as COSMETIC and state why.
- Interfaces severing disposal chains: before reporting, check if the DI container manages the concrete type's disposal. If the container tracks it, the interface gap has no runtime effect. Classify as COSMETIC.
- Override methods discarding parameters: a derived class that ignores a disposable parameter leaks it
- Double-dispose is worse than no-dispose: it corrupts shared state. But verify `Dispose()` is actually called from multiple threads before reporting a race; check all callers.
- Empty `Dispose()` is a red flag: check the constructor for resources
- Codebase uses .NET 10, C# 14, Autofac for DI
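
The checks in the `Cancel()` versus `Dispose()` rule can be turned into a first-pass triage. A rough sketch, assuming a file-level text scan is acceptable as a starting point: `cts_dispose_triage` and its two output labels are hypothetical, the scan misses target-typed `new(...)` constructions, and any mention of `Task.Delay(` is treated as suspicious even if no token is passed.

```shell
# Classifies one C# file by whether it shows any of the CTS usages the rule
# above names: timeout constructor, CancelAfter, CreateLinkedTokenSource,
# or Task.Delay.
# Prints CHECK-DISPOSE when at least one appears, else COSMETIC-CANDIDATE.
cts_dispose_triage() {
  file=$1
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 2
  fi
  risky='new CancellationTokenSource\([^)]+\)|\.CancelAfter\(|CreateLinkedTokenSource\(|Task\.Delay\('
  if grep -Eq -e "$risky" "$file"; then
    echo 'CHECK-DISPOSE'
  else
    echo 'COSMETIC-CANDIDATE'
  fi
}
```

Either label is only a prompt to read the code: a `COSMETIC-CANDIDATE` file may still hand its token to a method elsewhere that links or delays on it.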
