mutation-testing

Name: Mutation Testing
Author: Roasbeef

Description: Validates Go test suite quality through mutation testing using go-gremlins/gremlins. Mutates production code, runs the test suite against each mutant, and reports which mutants the tests fail to kill, exposing weak assertions that line coverage cannot detect. Use when evaluating test effectiveness, validating newly written tests, or improving test quality for mission-critical code (consensus, channel state, payment flows, crypto). Triggers: "mutation test", "are these tests strong", "validate test quality", "/mutation-testing".


Updated: May 20, 2026, 00:30

Repository: Roasbeef/claude-files



Mutation Testing

Mutation testing evaluates test quality by introducing small, deliberate bugs into production code (mutants) and checking whether the test suite fails. A test that passes on a mutant did not actually verify the behavior the mutant changed.

This skill is a thin orchestrator over go-gremlins/gremlins — a maintained Go mutation testing tool. The skill provides install, run, and analysis wrappers that produce machine-readable JSON for downstream tooling (notably the test-refine skill).

Why Mutation Testing

A test suite can hit 100% line coverage and still be useless: tests can execute code without asserting on its results, or assert only on fields that are irrelevant to the behavior under test. Mutation testing closes this gap by checking whether the test suite distinguishes the original code from a mutant. See references/coverage-pitfalls.md (in the test-refine skill) for the broader context.
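To make the gap concrete, here is a minimal sketch written in Python for brevity (gremlins itself mutates Go code). The function `apply_fee` and both tests are hypothetical; the mutant swaps `+` for `-`, which is the kind of change an arithmetic mutator makes.

```python
def apply_fee(amount: int, fee: int) -> int:
    """Original (hypothetical) production code."""
    return amount + fee


def apply_fee_mutant(amount: int, fee: int) -> int:
    """The same function after an arithmetic mutation: + became -."""
    return amount - fee


def weak_test(fn) -> bool:
    """Executes the code (full line coverage) but only checks the type."""
    return isinstance(fn(100, 5), int)


def strong_test(fn) -> bool:
    """Asserts on the computed value."""
    return fn(100, 5) == 105


# True means "the test passed". A mutant survives when the test still passes.
weak_survives = weak_test(apply_fee) and weak_test(apply_fee_mutant)
strong_kills = strong_test(apply_fee) and not strong_test(apply_fee_mutant)
print(f"weak test lets the mutant live: {weak_survives}")
print(f"strong test kills the mutant:   {strong_kills}")
```

Both tests give the same line coverage; only the second one tells the original apart from the mutant.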

When to Use

After generating tests with test-forge or by hand — verify they have real assertions.
Before merging consensus / payment / crypto code — quality gate on critical paths.
During code review — surface weak tests in the diff.
As a signal source for test-refine — survivors map to weak-assertion findings.

Target efficacy (gremlins terminology: test_efficacy = killed / (killed + lived)):

| Code class | Target |
| --- | --- |
| Mission-critical (consensus, wallet, channel, crypto) | 90%+ |
| Core business logic | 80–90% |
| General code | 70–80% |
| Trivial/glue code | run only if cheap |

Workflow

1. Install gremlins (once)

~/.claude/skills/mutation-testing/scripts/install-gremlins.sh

The script pins to a known-good version (override with GREMLINS_VERSION=...). Requires go on PATH and $(go env GOPATH)/bin on PATH.

2. Run mutations

# Default: cwd, JSON to .reviews/mutations/<slug>.json
~/.claude/skills/mutation-testing/scripts/unleash.sh

# Targeted package
~/.claude/skills/mutation-testing/scripts/unleash.sh \
    --pkg ./internal/wallet \
    --output .reviews/mutations/wallet.json

# With integration tests and a config file
~/.claude/skills/mutation-testing/scripts/unleash.sh \
    --pkg ./internal/channel \
    --integration \
    --config .gremlins.yaml \
    --silent

3. Analyze survivors

~/.claude/skills/mutation-testing/scripts/analyze-survivors.sh \
    --input .reviews/mutations/wallet.json \
    --output .reviews/mutations/wallet.md

Produces a markdown report with: efficacy/coverage summary, survivors ranked by file (consensus/channel/wallet paths bubble to the top), and mutator-type breakdown.
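The ranking idea can be sketched as follows. This is not the actual analyze-survivors.sh logic, which is not shown here; the keyword list, the sample report, and the function name `rank_survivors` are assumptions made for illustration. The input follows the JSON layout gremlins emits.

```python
import json

# Hypothetical priority keywords; the real script's ranking rules may differ.
CRITICAL = ("consensus", "channel", "wallet")

# Made-up report fragment in the gremlins JSON layout.
REPORT = json.loads("""
{
  "files": [
    {"file_name": "util/strings.go", "mutations": [
      {"line": 10, "column": 4, "type": "ARITHMETIC_BASE", "status": "LIVED"},
      {"line": 22, "column": 9, "type": "CONDITIONALS_NEGATION", "status": "LIVED"}
    ]},
    {"file_name": "wallet/balance.go", "mutations": [
      {"line": 42, "column": 8, "type": "CONDITIONALS_BOUNDARY", "status": "LIVED"},
      {"line": 57, "column": 3, "type": "ARITHMETIC_BASE", "status": "KILLED"}
    ]}
  ]
}
""")


def rank_survivors(report: dict) -> list:
    """Return (file_name, survivor_count) pairs, critical paths first."""
    rows = []
    for f in report.get("files", []):
        lived = sum(1 for m in f.get("mutations", []) if m["status"] == "LIVED")
        if lived:
            rows.append((f["file_name"], lived))
    # Critical files sort first (False < True), then files with more survivors.
    return sorted(
        rows,
        key=lambda r: (not any(k in r[0] for k in CRITICAL), -r[1]),
    )


ranked = rank_survivors(REPORT)
for name, count in ranked:
    print(f"{count:3d}  {name}")
```

Here wallet/balance.go is listed first even though it has fewer survivors, because its path matches a critical keyword.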

Gremlins JSON Schema

gremlins unleash --output <file> emits a single JSON document:

{
  "go_module": "github.com/example/foo",
  "test_efficacy": 82.00,
  "mutations_coverage": 80.00,
  "mutants_total": 100,
  "mutants_killed": 82,
  "mutants_lived": 8,
  "mutants_not_viable": 2,
  "mutants_not_covered": 10,
  "elapsed_time": 123.456,
  "files": [
    {
      "file_name": "wallet.go",
      "mutations": [
        { "line": 42, "column": 8, "type": "CONDITIONALS_NEGATION", "status": "KILLED" }
      ]
    }
  ]
}

Mutation status values:

| Status | Meaning | Action |
| --- | --- | --- |
| `KILLED` | Test suite caught the mutation | Good; no action |
| `LIVED` | Tests passed despite mutation | Survivor; strengthen tests |
| `NOT COVERED` | Mutation in code no test exercises | Add a test for that path |
| `TIMED OUT` | Tests timed out (implicit kill) | Investigate (might be perf bug) |
| `NOT VIABLE` | Mutation produced uncompilable code | Excluded from score |
| `RUNNABLE` | Dry-run only; would be tested | (only in `--dry-run`) |
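A small sketch of how a downstream tool might tally these statuses from a parsed report. The sample data and the names `ACTIONS` and `tally` are hypothetical; the action strings paraphrase the status table.

```python
from collections import Counter

# Actions paraphrased from the status table; keys are the status strings
# as they appear in the JSON report.
ACTIONS = {
    "KILLED": "no action",
    "LIVED": "strengthen tests",
    "NOT COVERED": "add a test for that path",
    "TIMED OUT": "investigate",
    "NOT VIABLE": "excluded from score",
}


def tally(report: dict) -> Counter:
    """Count mutants per status across every file in a parsed report."""
    return Counter(
        m["status"]
        for f in report.get("files", [])
        for m in f.get("mutations", [])
    )


# Made-up report fragment for illustration.
sample = {
    "files": [
        {
            "file_name": "wallet.go",
            "mutations": [
                {"status": "KILLED"},
                {"status": "LIVED"},
                {"status": "NOT COVERED"},
                {"status": "KILLED"},
            ],
        }
    ]
}

counts = tally(sample)
for status, n in counts.items():
    print(f"{status}: {n} ({ACTIONS.get(status, 'see gremlins docs')})")
```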

Key metrics:

test_efficacy = killed / (killed + lived) — quality of assertions on covered code.
mutations_coverage = (killed + lived) / (killed + lived + not_covered) — how much code is exercised at all.

A high mutations_coverage with low test_efficacy means tests run code without verifying its behavior — the classic "100% line coverage, 0% real testing" failure mode.
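The two formulas above translate directly into code. The counts below are made up to show the failure mode just described; the function names are chosen for this sketch only.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage that tolerates an empty denominator."""
    return 100.0 * numerator / denominator if denominator else 0.0


def metrics(killed: int, lived: int, not_covered: int) -> tuple:
    """Return (test_efficacy, mutations_coverage) as percentages."""
    efficacy = pct(killed, killed + lived)
    coverage = pct(killed + lived, killed + lived + not_covered)
    return efficacy, coverage


# Made-up counts: most mutants are reached by tests, but few are killed.
eff, cov = metrics(killed=30, lived=60, not_covered=10)
print(f"efficacy={eff:.2f}% coverage={cov:.2f}%")
# prints: efficacy=33.33% coverage=90.00%
```

Coverage of 90% with efficacy near 33% means the tests reach the code but rarely notice when it changes.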

Configuration

Gremlins is configured via .gremlins.yaml (or --config <path>). Mutators ship default-on for safe operators and default-off for aggressive ones.

Default-on mutators (enabled unless turned off in config):

arithmetic-base — + - * / %
conditionals-boundary — < <= > >=
conditionals-negation — == !=, boolean conditions
increment-decrement — ++ --
invert-negatives — -x ↔ +x
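As an illustration of what a conditionals-boundary mutation looks like and which test kills it, here is a sketch in Python (gremlins applies the equivalent change to Go source). `LIMIT`, `within_limit`, and the test helpers are hypothetical.

```python
LIMIT = 1000  # hypothetical maximum payment amount


def within_limit(amount: int) -> bool:
    """Original check: the limit itself is allowed."""
    return amount <= LIMIT


def within_limit_mutant(amount: int) -> bool:
    """Boundary mutation: <= became <."""
    return amount < LIMIT


def interior_tests(fn) -> bool:
    """Checks values well inside and well outside the limit."""
    return fn(500) is True and fn(2000) is False


def boundary_test(fn) -> bool:
    """Checks the threshold value itself."""
    return fn(LIMIT) is True


mutant_survives_interior = interior_tests(within_limit_mutant)
mutant_killed_at_boundary = not boundary_test(within_limit_mutant)
print(mutant_survives_interior, mutant_killed_at_boundary)
```

Only the test at the threshold distinguishes the two versions.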

Default-off mutators — enable for critical packages:

invert-assignments — += -= *= /= etc. swaps
invert-bitwise — & | ^ swaps
invert-bwassign — &= |= ^= swaps
invert-logical — && ↔ || (security-critical: catches auth bypass mutations)
invert-loopctrl — break ↔ continue
remove-self-assignments — drop x = x op y updates
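To show why invert-logical matters for authorization code, here is a Python sketch of an `&&` to `||` style mutation (written with Python's `and` / `or`). The function `can_withdraw` and both test helpers are hypothetical.

```python
def can_withdraw(is_authenticated: bool, owns_account: bool) -> bool:
    """Original check: both conditions must hold."""
    return is_authenticated and owns_account


def can_withdraw_mutant(is_authenticated: bool, owns_account: bool) -> bool:
    """Logical mutation: 'and' became 'or', an auth bypass."""
    return is_authenticated or owns_account


# (is_authenticated, owns_account, expected result)
TRUTH_TABLE = [
    (False, False, False),
    (False, True, False),
    (True, False, False),
    (True, True, True),
]


def happy_path_only(fn) -> bool:
    """Only checks the case where everything is allowed."""
    return fn(True, True) is True


def full_truth_table(fn) -> bool:
    """Checks every combination of the two inputs."""
    return all(fn(a, b) is expected for a, b, expected in TRUTH_TABLE)


print("happy path passes on mutant:", happy_path_only(can_withdraw_mutant))
print("truth table passes on mutant:", full_truth_table(can_withdraw_mutant))
```

The happy-path test passes on the mutant, so the bypass would go unnoticed; the truth-table test fails on it.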

Recommended config for consensus/wallet/payment code:

silent: false
unleash:
  workers: 0          # use all CPUs
  test-cpu: 0         # no per-test CPU pinning
  threshold:
    efficacy: 90      # fail if below 90%
    mutant-coverage: 85
mutants:
  arithmetic-base:        { enabled: true }
  conditionals-boundary:  { enabled: true }
  conditionals-negation:  { enabled: true }
  increment-decrement:    { enabled: true }
  invert-negatives:       { enabled: true }
  invert-assignments:     { enabled: true }
  invert-bitwise:         { enabled: true }
  invert-bwassign:        { enabled: true }
  invert-logical:         { enabled: true }   # critical for && / || in auth
  invert-loopctrl:        { enabled: true }
  remove-self-assignments: { enabled: true }

See gremlins.dev configuration docs for the full schema.

Threshold Gating (CI)

For CI, use --silent and set thresholds in config or via env vars:

gremlins unleash --silent --output mutations.json ./...
# Exit nonzero if efficacy < threshold.

The unleash.threshold.efficacy and unleash.threshold.mutant-coverage keys cause gremlins to exit nonzero when the run falls below the configured percentages — wire this into your PR check.
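Gremlins enforces these thresholds itself, so no extra code is needed for the basic gate. If a pipeline needs a separate check on the JSON report (for example, different thresholds per package), a sketch could look like the following. The function name `gate` and the sample values are assumptions; the field names come from the JSON schema shown earlier.

```python
import sys


def gate(report: dict, min_efficacy: float, min_coverage: float) -> int:
    """Return a process exit code: 0 if both thresholds hold, else 1."""
    failures = []
    if report["test_efficacy"] < min_efficacy:
        failures.append(
            f"test_efficacy {report['test_efficacy']:.2f} < {min_efficacy:.2f}"
        )
    if report["mutations_coverage"] < min_coverage:
        failures.append(
            f"mutations_coverage {report['mutations_coverage']:.2f} < {min_coverage:.2f}"
        )
    for msg in failures:
        print(msg, file=sys.stderr)
    return 1 if failures else 0


# Summary fields as they would appear in a parsed report (made-up values).
sample = {"test_efficacy": 82.0, "mutations_coverage": 80.0}

strict = gate(sample, min_efficacy=90.0, min_coverage=85.0)
lenient = gate(sample, min_efficacy=80.0, min_coverage=80.0)
print(strict, lenient)
```

In a real job the return value would be passed to `sys.exit` after loading the report with `json.load`.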

Integration with Other Skills

`test-refine`

The test-refine skill consumes gremlins JSON to identify weak-assertion zones (smell S12: mutation-survivor). When invoked with --use-mutations, it calls unleash.sh and cross-references LIVED mutants with the AST smell scan.

`test-forge`

After test-forge generates tests, run mutation testing to validate them. LIVED mutants are direct evidence of weak assertions in the generated tests.

`code-review`

Include the test_efficacy delta in PR review: a drop of more than 5 percentage points on covered code is a strong signal of weakening test quality.
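A minimal sketch of that delta check, assuming two parsed gremlins reports (one from the base branch, one from the PR head) and reading the 5% figure as percentage points. The function name and sample values are hypothetical.

```python
def efficacy_regressed(base: dict, head: dict, max_drop: float = 5.0) -> bool:
    """True when test_efficacy fell by more than max_drop percentage points."""
    return base["test_efficacy"] - head["test_efficacy"] > max_drop


# Made-up summary values for illustration.
base_report = {"test_efficacy": 91.0}
small_drop = {"test_efficacy": 88.0}
large_drop = {"test_efficacy": 85.5}

print(efficacy_regressed(base_report, small_drop))
print(efficacy_regressed(base_report, large_drop))
```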

Interpreting Results

High efficacy (≥90%): Tests have strong assertions. Focus remaining work on NOT COVERED mutants (uncovered code paths).

Medium (75–90%): Tests cover main paths. Survivors usually indicate boundary or error-path gaps.

Low (<75%): Significant gaps — tests likely run code without checking outputs. Pair with test-refine to identify the specific smells.

Mutator breakdown tells you the kind of weakness:

conditionals-boundary LIVED → missing edge tests at thresholds.
invert-logical LIVED → missing truth-table coverage for &&/||.
arithmetic-base LIVED → tests don't verify calculation results.
remove-self-assignments LIVED → state mutations not asserted.
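The breakdown above can be computed from the report by counting LIVED mutants per type. This sketch assumes the `type` strings are the upper-case, underscore form of the mutator names (as with CONDITIONALS_NEGATION in the sample JSON); the hint texts paraphrase the list above, and the sample data is made up.

```python
from collections import Counter

# Assumed type strings; hints paraphrase the interpretation list.
HINTS = {
    "CONDITIONALS_BOUNDARY": "add edge tests at thresholds",
    "INVERT_LOGICAL": "cover the truth table for && / ||",
    "ARITHMETIC_BASE": "assert on calculation results",
    "REMOVE_SELF_ASSIGNMENTS": "assert on state changes",
}


def lived_by_type(report: dict) -> Counter:
    """Count surviving mutants per mutator type."""
    return Counter(
        m["type"]
        for f in report.get("files", [])
        for m in f.get("mutations", [])
        if m["status"] == "LIVED"
    )


sample = {
    "files": [
        {
            "file_name": "channel.go",
            "mutations": [
                {"type": "CONDITIONALS_BOUNDARY", "status": "LIVED"},
                {"type": "CONDITIONALS_BOUNDARY", "status": "LIVED"},
                {"type": "INVERT_LOGICAL", "status": "LIVED"},
                {"type": "ARITHMETIC_BASE", "status": "KILLED"},
            ],
        }
    ]
}

breakdown = lived_by_type(sample)
for mutator, n in breakdown.most_common():
    print(f"{mutator}: {n} survivor(s), hint: {HINTS.get(mutator, 'review manually')}")
```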

Equivalent Mutants

Some LIVED mutants are semantically equivalent to the original — no test could kill them. Common cases:

Mutated value immediately overwritten before being read.
Mutation in unreachable code.
Operator swap in associative/commutative context with no observable difference.

When you identify an equivalent mutant, document it (e.g., a comment near the mutation site, or a project-level EQUIVALENT_MUTANTS.md) so reviewers don't waste time on it. Gremlins doesn't filter equivalents automatically.
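A small example of an equivalent mutant, sketched in Python with a hypothetical `larger` helper: changing `>` to `>=` only affects the case where both inputs are equal, and in that case either branch returns the same value.

```python
def larger(a: int, b: int) -> int:
    """Original: returns the larger of two integers."""
    return a if a > b else b


def larger_mutant(a: int, b: int) -> int:
    """Boundary mutation: > became >=."""
    return a if a >= b else b


# Compare both versions over a small exhaustive input range.
disagreements = [
    (a, b)
    for a in range(-3, 4)
    for b in range(-3, 4)
    if larger(a, b) != larger_mutant(a, b)
]
print("inputs where the mutant differs:", disagreements)
```

The list is empty, so no assertion on the return value can kill this mutant; it is a candidate for the documented-equivalents list rather than for more tests.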

Gremlins Limitations

From the upstream README: gremlins targets smallish Go modules (microservices). On very large modules, runs can take hours. Mitigations:

Per-package runs via --pkg ./internal/wallet. Don't pass ./... on a 500k-LOC monorepo.
Skip generated code by using build tags or running on hand-written packages only.
Use --workers to bound parallelism if memory is tight.
Use --dry-run first to preview the mutation count and skip if it's too large.
