false-positive-reduction
// Hybrid FP-reduction — joern when present, LLM fallback when absent. Six-stage rubric (Stage 0 + Stages 1-5) applied to every finding; emits the disposition register.
| Field | Value |
|---|---|
| name | false-positive-reduction |
| description | Hybrid FP-reduction — joern when present, LLM fallback when absent. Six-stage rubric (Stage 0 + Stages 1-5) applied to every finding; emits the disposition register. |
| role | worker |
| user-invocable | false |
| version | 1.0.0 |
| maintainers | ["bdfinst","unassigned"] |
| required-primitives-contract | ^1.0.0 |
Transform a stream of unified findings into a disposition register that the exec-report-generator can trust. Every finding gets a verdict (true_positive | likely_true_positive | uncertain | likely_false_positive | false_positive), a reachability trace, an exploitability score, and a reachability_source tag (joern-cpg or llm-fallback).
The skill's job is to remove noise without suppressing real issues. False positives waste analyst attention; missed true positives get someone fired.
Lifted from the opus_repo_scan_test reference's § analyze-11 framework with extensions for the disposition-register output format. Stage 0 is new: a self-adversarial pre-pass that sharpens Stage 1 and strengthens the audit trail.
Question: What is the strongest argument that this finding is NOT a vulnerability?
The agent generates a counter-argument before applying the rubric. This is not a skip gate — all five subsequent stages still run. The purpose is twofold:
- A true_positive that explicitly refuted a counter-argument is more trustworthy than one that never examined the counter-case. A well-reasoned false_positive is more trustworthy than a silent discard.

Counter-argument prompts:

- entry_points don't include?

Disposition rules:

- da_strong: true; Stage 1 tests the hypothesis
- da_strong: false; Stage 1 performs open-ended reachability search
- da_strong: true + Stage 1 confirms (unreachable) → false_positive; both arguments cited in rationale
- da_strong: true + Stage 1 disproves (reachable) → rejected counter-argument cited in true_positive rationale

Question: Is this code executed in production at all?
Disposition rules:
- verdict: false_positive, severity → info presentational.
- verdict: likely_false_positive, severity → one level down (CRITICAL → HIGH, HIGH → MEDIUM).
- true_positive.reachability.rationale.

Joern-present mode: reachability is computed from the CPG by tracing back from the finding location to HTTP/CLI/lambda/cron entry points.
Joern-absent mode: the agent reasons from RECON's entry_points and security_surface fields, plus grep over the call sites. Tag each entry with reachability_source: llm-fallback.
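The interaction between the Stage 0 counter-argument and the Stage 1 reachability result can be sketched as follows. This is a minimal illustration under stated assumptions, not the agent's implementation: the function name `combine_stage0_stage1` and its boolean inputs are hypothetical, and only the two `da_strong: true` outcomes spelled out in the Stage 0 rules are encoded. Every other case is left undecided, since the later stages run regardless.

```python
from typing import Optional, Tuple


def combine_stage0_stage1(da_strong: bool, reachable: bool) -> Tuple[Optional[str], str]:
    """Return (verdict, rationale_note) for the documented da_strong outcomes.

    A verdict of None means this sketch does not decide the case.
    Hypothetical helper: the name and signature are not from the skill itself.
    """
    if da_strong and not reachable:
        # Stage 1 confirmed the counter-argument: the code is unreachable.
        return "false_positive", "cite both the counter-argument and the Stage 1 result"
    if da_strong and reachable:
        # Stage 1 disproved the counter-argument; a later true_positive
        # rationale should cite the rejected argument.
        return None, "cite the rejected counter-argument in a true_positive rationale"
    # da_strong is False: Stage 1 ran an open-ended reachability search.
    return None, "no counter-argument to cite"
```

Returning `None` rather than a verdict in the reachable case is deliberate: the rules say what the rationale must cite, not that Stages 2 to 5 are skipped.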
Question: Could deployed configuration override the committed value, making the finding inert?
Disposition rules:
- values.yaml or Helm chart overrides a committed default) → one level down, verdict: likely_true_positive (the committed value is still a weak default).

The agent consults docker-compose*.yml, values.yaml, helmfile.yaml, k8s/*.yaml, and any CI-scoped env vars discoverable in .github/workflows/* or GitLab equivalents.
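As a rough illustration, the file list above can be turned into a small discovery helper. The function name `find_override_candidates` is hypothetical, the patterns are matched relative to the repository root only (an assumption; the skill may search deeper), and GitLab equivalents are omitted because the text does not name them.

```python
from pathlib import Path
from typing import List

# Glob patterns copied from the file list in the text.
CONFIG_PATTERNS = (
    "docker-compose*.yml",
    "values.yaml",
    "helmfile.yaml",
    "k8s/*.yaml",
    ".github/workflows/*",
)


def find_override_candidates(repo_root: str) -> List[str]:
    """Return sorted repo-relative paths of files that might override a committed value."""
    root = Path(repo_root)
    found = set()
    for pattern in CONFIG_PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

The helper only locates candidate files; deciding whether one of them actually overrides the committed value remains a judgment made by the agent.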
Question: Is there a control in the repo that mitigates this finding's impact?
Disposition rules:
- likely_true_positive with the control's file:line in the rationale.

Question: Is this the same root cause as another finding already in the register?
Disposition rules:
- locations array; emit one disposition entry referencing the primary finding.
- semgrep.python.hardcoded-password + gitleaks.generic.aws-access-key firing on the same line) → dedupe keeping the higher-priority source per the static-analysis skill's priority order.

Question: Is the severity consistent across similar findings?
Disposition rules:
The per-finding exploitability score determines the presentational severity bucket (see primitives contract § Severity mapping). Factors:
| Factor | Weight | Example |
|---|---|---|
| Network reachability | +3 | Finding is in an HTTP handler on a public route |
| Authentication bypass | +3 | Finding bypasses an auth check (not merely missing one) |
| Credential exposure | +2 | Finding leaks a credential an attacker could use elsewhere |
| Input-controlled | +2 | An attacker can influence the vulnerable value via request parameters |
| Persistent | +1 | Finding creates persistent state (stored XSS, stored credentials) |
| Privileged context | +1 | Finding runs in an elevated context (root, admin route) |
| Cascading | +1 | A successful exploit unlocks further access (lateral movement) |
Rationale field is mandatory (min 20 chars per schema). Summarize which factors applied and why.
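A minimal sketch of the scoring step, assuming the score is the plain sum of the weights of the factors that apply (the table suggests this, but the aggregation rule is not stated explicitly). The snake_case factor keys and the function name `exploitability_score` are illustrative, not part of the schema.

```python
from typing import Iterable

# Weights copied from the factor table; the key names are hypothetical.
FACTOR_WEIGHTS = {
    "network_reachability": 3,
    "authentication_bypass": 3,
    "credential_exposure": 2,
    "input_controlled": 2,
    "persistent": 1,
    "privileged_context": 1,
    "cascading": 1,
}

MIN_RATIONALE_CHARS = 20  # "min 20 chars per schema"


def exploitability_score(factors: Iterable[str], rationale: str) -> int:
    """Sum the weights of the applicable factors after checking the rationale length."""
    if len(rationale) < MIN_RATIONALE_CHARS:
        raise ValueError("rationale must be at least 20 characters")
    applied = set(factors)  # a factor counts once, even if listed twice
    unknown = applied - FACTOR_WEIGHTS.keys()
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    return sum(FACTOR_WEIGHTS[name] for name in applied)
```

Under this assumption, a finding in a public HTTP handler whose vulnerable value comes from request parameters scores 3 + 2 = 5. Mapping the score to a severity bucket is defined in the primitives contract and is not reproduced here.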
If joern is on PATH, invoke via tools/reachability.sh (build commands + CPG cache details are in the script). Stage 1 reachability queries the CPG for paths from the finding location back to entry points; cite the entry point path in reachability.rationale.
When joern is absent, Stages 1–3 use judgment rather than CPG data; Stages 4–5 work unchanged. Every entry in fallback mode carries reachability_source: llm-fallback. The exec-report-generator detects this and emits a banner (see agents/exec-report-generator.md § Section 0 banners).
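The choice between the two modes comes down to whether a `joern` executable is on PATH. The sketch below only illustrates how the `reachability_source` tag could be chosen; the actual check lives in tools/reachability.sh, which is not shown here, and the function name `select_reachability_source` is hypothetical.

```python
import shutil
from typing import Callable, Optional


def select_reachability_source(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return "joern-cpg" when joern is found on PATH, otherwise "llm-fallback".

    The `which` parameter exists only so the lookup can be stubbed in tests.
    """
    return "joern-cpg" if which("joern") else "llm-fallback"
```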
The skill outputs a DispositionRegister object conforming to plugins/agentic-dev-team/knowledge/schemas/disposition-register-v1.json. Required entry fields and required envelope fields (schema_version, generated_at, dispositioner, reachability_tool, entries[]) are defined in the schema. The register is written to memory/disposition-<assessment-slug>.json.
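For orientation, the envelope might be assembled along these lines. Only the five envelope field names, the verdict values, and the output path pattern come from this document; the `schema_version` value, the helper names, and the shape of each entry are placeholders, and the JSON Schema remains the authoritative definition.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

# Verdict values listed in the skill description.
VERDICTS = {
    "true_positive",
    "likely_true_positive",
    "uncertain",
    "likely_false_positive",
    "false_positive",
}


def build_register(entries: List[Dict[str, Any]], dispositioner: str,
                   reachability_tool: str) -> Dict[str, Any]:
    """Wrap disposition entries in the envelope fields named in the text."""
    for entry in entries:
        if entry.get("verdict") not in VERDICTS:
            raise ValueError(f"unknown verdict: {entry.get('verdict')!r}")
    return {
        "schema_version": "1.0",  # placeholder; the real value is set by the schema
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "dispositioner": dispositioner,
        "reachability_tool": reachability_tool,
        "entries": entries,
    }


def register_path(assessment_slug: str) -> str:
    """Build the output path memory/disposition-<assessment-slug>.json."""
    return f"memory/disposition-{assessment_slug}.json"
```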
- agents/fp-reduction.md: the opus agent that implements this skill
- plugins/agentic-dev-team/knowledge/security-primitives-contract.md: disposition register schema + severity mapping
- plugins/agentic-dev-team/knowledge/schemas/disposition-register-v1.json: JSON Schema
- tools/reachability.sh: joern wrapper (installed if joern is on PATH)