一键导入
consensus-safety-invariants
// L1 trigger - detects non-determinism, state transition completeness violations, and safety/liveness invariant breaks in consensus code. Inject into depth-consensus-invariant or depth-state-trace.
// L1 trigger - detects non-determinism, state transition completeness violations, and safety/liveness invariant breaks in consensus code. Inject into depth-consensus-invariant or depth-state-trace.
| name | consensus-safety-invariants |
| description | L1 trigger - detects non-determinism, state transition completeness violations, and safety/liveness invariant breaks in consensus code. Inject into depth-consensus-invariant or depth-state-trace. |
L1 trigger:
L1_PATTERN=trueAND (consensus/ORstate_transition/ORfork_choice/ORbeacon_chain/detected in recon subsystem map) Inject Into:depth-consensus-invariant(primary) ordepth-state-trace(fallback) Language: Go and Rust (language-specific examples embedded) Finding prefix:[CS-N](consensus-safety) Status: v0.1 draft, Round 4 exemplars pending
Map sections to the assigned depth agent:
Recon sets L1_PATTERN=true and identifies a consensus subsystem in the target. This skill attaches to agents auditing that subsystem. It is the single most load-bearing L1 skill — consensus bugs are Critical tier by default.
A consensus implementation is non-deterministic if two honest nodes processing the same input can reach different output states. Each of the following is a production-bug class:
In Go, for k, v := range m visits keys in an unspecified order. In Rust, HashMap iteration order varies between runs (non-deterministic by default; BTreeMap is ordered).
Check: For every consensus-critical function, enumerate every map iteration. For each:
sort.Strings(keys) in Go; BTreeMap or explicit .sort() in Rust) before iteration?Known exemplar class: Cosmos SDK Dragonberry/Elderflower (2022) involved IAVL tree iteration that became load-bearing for proof generation.
Tag: [NON-DET:map-iter:{file}:{line} → {consumer}]
time.Now(), std::time::SystemTime::now(), system clocks, file mtimes — any value that differs across nodes at the same block height.
Check: Grep/ast-grep for time sources. For each hit, trace forward: does the value enter consensus state?
Tag: [NON-DET:wall-clock:{source} → {consumer}]
Rule: floating-point arithmetic in consensus-reachable code paths is a CRITICAL-default finding. There are no portable, toolchain-stable, and SIMD-safe FP operations in production node-client code. "But we compiled with strict FP" is not a defense — the risk surface is compiler version, LLVM backend, target triple, feature flags, and the standard library implementation of transcendental functions, any of which can change across releases.
Known exemplars:
Polkadot 2021-05-24 toolchain-driven divergence — a consensus path used a Rust stdlib routine whose implementation differed between stable Rust 1.51 and nightly (same source, different generated code after an stdlib change). Two honest validators, both running the same source, produced different state because one had upgraded toolchain. Fix was twofold: ban FP in consensus code AND migrate affected logic to deterministic fixed-point. This is the canonical evidence that "identical source compiles identically" is false on FP paths.
Cosmos SDK #7773 / #15381 — FP in fee / reward math produced
validator-to-validator divergence in testnet replays; the fix
(#15381) explicitly
removed FP from the state machine and added lints to prevent
regressions. SDK defensive_programming guidance now forbids FP in
state-machine code, not just "discourages."
Parity Ethereum #6511 — early Parity discussion documenting why the Yellow Paper has no FP opcodes: structural ban, not stylistic. Any FP introduced by a precompile or host function wrapping a native FP op recreates the same risk.
EVM Yellow Paper — the bytecode has ZERO FP opcodes by design. Protocols that introduce FP via precompiles, host functions, or a custom VM extend the risk surface of their client beyond the spec.
Check:
float64, f64, f32, FloatingPointValue, and native
transcendentals (.sin(), .exp(), .ln(), .pow()) in every
consensus-reachable module.#![allow(clippy::float_arithmetic)] or the absence of
#![forbid(clippy::float_arithmetic)] at the crate root for Rust
consensus crates. The absence of forbid is itself a finding for any
crate that participates in state transitions.Severity: Consensus-reachable FP is CRITICAL-default. Non-consensus
FP (telemetry, RPC display) is Informational. Absent forbid lint on a
consensus crate is Low (defense-in-depth).
Tag: [NON-DET:float:{op}:{file}:{line}] — upgrade to
[NON-DET:float:CRITICAL:{op}:{file}:{line}] when the value reaches
state root / hash input / signed payload.
Channels, select statements with multiple ready cases, goroutine scheduling — any of these can produce order-dependent output.
Check: Ast-grep for select { case ... } with multiple readable cases inside consensus code. For each, verify the branch taken does not affect state.
Tag: [NON-DET:goroutine-order:{file}:{line}]
JSON/protobuf unmarshal with fields in different orders; strconv of NaN/Inf; locale-dependent string operations.
For every state-modifying function in the consensus path:
find-references or SCIP query)A is updated, is its paired variable (A_checkpoint, A_prev, A_accumulator) also updated?return err, is state rolled back? Or do some writes persist?| Function | State vars touched | All paths consistent? | Rollback on error? | Gap |
|---|---|---|---|---|
applyBlock | {list from LSP} | {yes/no} | {yes/no} | {specific gap} |
Tag: [STATE-GAP:{func}:{path} → {vars-updated} / {vars-missing}]
Safety: "nothing bad ever happens." For every documented invariant (from recon design_context.md or protocol spec):
∀ state s, predicate P(s) = truefind-references)Tag: [SAFETY:{invariant}:{break-path}]
A block header is a struct with N fields, every one of which flows into consensus. The validator must check EVERY field against an INDEPENDENT ground truth — not against itself, not against another producer-supplied field in the same block.
Methodology:
struct BlockHeader, struct IrysBlockHeader, etc.)prevalidate_block / validate_header / equivalentCommon gaps: previous_solution_hash, block.height, last_diff_timestamp, epoch_index, vdf_seed, reward_amount are all frequent victims. Validators check block.timestamp but forget the rest.
Tag: [HEADER-FIELD-UNVERIFIED:{field-name}]
Every signed transaction format must carry a per-sender replay-protection field. EVM uses nonce. Non-EVM tx formats (Cosmos sequence, Solana recent_blockhash, Irys anchor) use varied schemes.
Methodology:
valid_for_block_height AND parent_blockhash are validated against LOCAL parent, not against producer-supplied parentTag: [REPLAY-MISSING:{tx-type}], [REPLAY-CIRCULAR:{tx-type}]
For every consensus-relevant ID field (block_hash, tx_id, commitment_id) that is derived from a hash of the signed object, the verifier must independently RECOMPUTE the hash and compare against the producer-supplied ID. Just verifying the signature is not enough — the ID can be set to any value while the signature stays valid.
Methodology:
crates/types/ or equivalent: sign_block, sign_tx, sign_commitment_txid = keccak256(signature.as_bytes()) or hash(payload))validate_block / validate_txrecomputed_id == provided_idblock.block_hash = arbitrary_value while keeping the signature valid. Two valid blocks with the same hash, or one valid block with a fake hash field → consensus split.Tag: [ID-NOT-VERIFIED:{type}.{id-field}]
§3a tells you what to do; §3d makes it mechanically verifiable. For every block-header / transaction / commitment struct in the audit scope, produce an explicit per-field checklist BEFORE writing any finding. This prevents the "I checked timestamp and stopped" failure mode where the agent reports one validated field and forgets the other 14.
Required artifact: when this skill triggers, write {SCRATCHPAD}/validation_bundle_{struct_name}.md:
# Validation Bundle: IrysBlockHeader (50 fields)
| # | Field | Type | Validator function | Source of truth | Status |
|---|---|---|---|---|---|
| 1 | block_hash | H256 | validate_block_hash() at consensus/src/block_validator.rs:L42 | recomputed from payload | OK |
| 2 | timestamp | u64 | check_timestamp() at L78 | local clock ± 30s drift | OK |
| 3 | previous_solution_hash | H256 | — | — | **MISSING** |
| 4 | block.height | u64 | implicit (parent.height + 1) | parent block | **CIRCULAR — uses producer-supplied parent ref** |
| 5 | last_diff_timestamp | u64 | — | — | **MISSING** |
| 6 | difficulty | U256 | check_difficulty() at L156 | adjust_difficulty(parent.difficulty, time_delta) | OK |
| ... | ... | ... | ... | ... | ... |
Rules:
{SCRATCHPAD}/scip/repo_map.md to extract the full field list from the struct definition. Do NOT enumerate from memory. (MCP tools are unavailable in subagent contexts per Claude Code bug #25200.)validate_block / prevalidate_block / validate_header / equivalent for ANY reference to the field name. Record the function and line.OK (validated against independent ground truth), CIRCULAR (validated against another producer-supplied field), MISSING (no validator reference at all), BOUNDED (validator exists but bound is too loose).MISSING row generates a [HEADER-FIELD-UNVERIFIED:{field-name}] finding. Every CIRCULAR row generates a [HEADER-FIELD-CIRCULAR:{field-name}] finding. The bundle is the evidence — the orchestrator can verify the agent enumerated EVERY field by counting checklist rows against the struct field count.Why mandatory: Run 7 Core/Thorough on Irys produced finding "block.timestamp not validated against local clock" but missed previous_solution_hash, block.height, and last_diff_timestamp — the §3a methodology was present but didn't force enumeration. §3d closes this by making the checklist artifact the precondition for any header-validation finding. No checklist → no header-validation finding accepted.
Tag: [BUNDLE-INCOMPLETE:{struct_name}:{field_count_validated}/{total_fields}]
Liveness: "something good eventually happens." Plamen cannot prove liveness mechanically, but can identify liveness-blocking patterns:
for / while / loop in consensus code that can iterate > block-gas-limit times. Tag: [LIVENESS:unbounded-loop:{loc}][LIVENESS:lock-over-io:{loc}][LIVENESS:deadlock-self:{loc}][LIVENESS:halt-2-3:{loc}]For computationally expensive verifier routines (VDF step replay, ZK proof verification, BLS aggregation, recursive hash chains), the verifier must check cheap preconditions BEFORE entering the expensive compute loop. Otherwise an attacker forces the validator to pay full compute cost on a doomed-to-fail input, then learn it was invalid.
Methodology:
validate_* / verify_* function with a clear cheap-vs-expensive split, identify the loop body costvdf_steps.len() <= MAX_STEPS_PER_BLOCK)step_number < local_step + MAX_LOOKAHEAD)vdf_steps = [crafted; N] where N is huge but the result is wrong. Verifier loops N times then reports invalid — N times wasted CPU.Tag: [VERIFIER-LATE-REJECT:{func}]
When the protocol uses VDF, PoW, recursive hash chains, or similar proof-of-computation primitives, ask:
A deterministic seed chain or an unbounded claimed step count is a consensus and DoS finding, not just a performance concern.
Tag: [POX-ENTROPY:{func}], [POX-STEP-COUNT:{func}]
For all difficulty adjustment, fee calculation, and EMA computations using fixed-point arithmetic (u64/u128/U256 with implicit decimals), audit the order of operations: multiplication must precede division to avoid intermediate truncation.
Common bug: a / b * c truncates to integer at the division step, losing precision. The correct form is (a * c) / b, with an overflow check on the intermediate product.
Methodology:
adjust_difficulty, compute_*_fee, *_ema, interpolate, decay_factorx / y * z or (x / y) * z pattern as a finding unless y divides x exactly by constructionmulDiv style helpersTag: [FIXED-POINT-DIV-FIRST:{func}]
Apply the four-state sweep per consensus-critical function:
| State | Values tested | Expected behavior | Observed | Result |
|---|---|---|---|---|
| Zero | all params = 0 / empty / None | {from spec} | {from code} | |
| One | single validator / block / vote | {from spec} | {from code} | |
| Max | N_MAX validators / max balance / max slot | {from spec} | {from code} | |
| Byzantine boundary | 1/3 − 1, 1/3, 1/3 + 1, 2/3 − 1, 2/3, 2/3 + 1 Byzantine | {from spec} | {from code} |
Tag: [BOUNDARY:{state}:{var}={value} → {result}]
If the target is a fork of an existing consensus client (op-geth, op-reth, custom cometbft), use the fork-ancestry primitive:
git diff upstream/main...HEAD -- consensus/ (or equivalent subsystem path)Tag: [DRIFT:{func}:{upstream-behavior} → {fork-behavior}]
Findings use the standard Plamen format with these L1-specific fields:
[NON-DET-PASS] (run 2x, diff output) > [CONFORMANCE-PASS] (spec test) > [DIFF-PASS] (fork differential) > [LSP-TRACE] > [CODE-TRACE]Aptos deterministic→non-deterministic map refactor (October 18, 2023) — a performance change in aptos-core replaced a deterministic map with Rust HashMap in VM output handling. When a transaction hit its gas limit, the FeeStatement event's I/O gas summation iterated the map in arbitrary order, producing different gas-used totals on different validators. Chain halted for ~5 hours. Aptos incident report; technical detail. Skill catch point: Section 1a — enumerate map iteration in any function reachable from state-touching paths.
Cosmos SDK Go map iteration non-determinism (ongoing class, EPIC #13039) — Go's runtime intentionally randomizes range m for map[K]V. State-machine code that iterates maps produces divergent results across validators. Long-running class; multiple modules affected. EPIC tracker; writeup. Skill catch point: Same as Aptos — Section 1a.
Geth dataCopy shallow-copy + RETURNDATACOPY consensus split (CVE-2020-26241, discovered by Fluffy OSDI '21) — the dataCopy precompile (0x04) performed a shallow copy. Attacker writes X to memory, calls 0x04, overwrites memory with Y, calls RETURNDATACOPY. Geth returned Y; spec-compliant clients returned X. Consensus divergence. Fixed v1.9.17. NVD; Fluffy paper. Skill catch point: Section 6 (Cross-client consistency) + execution-client-hardening Section 4 (precompiles).
Geth transfer-after-destruct consensus divergence (CVE-2020-26265, Fluffy OSDI '21) — second Fluffy finding. Transfer semantics to an already-destructed contract diverged between Geth and OpenEthereum. Caused a mainnet hard fork event 4 months after disclosure. Fluffy paper. Skill catch point: Section 2 — state transition completeness for contract lifecycle (create → live → destruct → resurrect via CREATE2).
Solana stake inflation u64 overflow (September 2022 outage) — during the coordinated restart after the duplicate-block fork, the inflation mechanism generated enough new SOL to overflow a u64; stake-percentage math × 100 produced values exceeding max possible. Contributed to extended downtime. Helius outage history. Skill catch point: Section 3 (safety invariants) — total_supply ≤ u64::MAX / safety_margin invariant.
Cosmos-SDK x/group div-by-zero chain halt (ASA-2025-003, v0.50.11) — division by zero when total voting power is zero in a group proposal path, unrecovered panic halts chain. GHSA-x5vx-95h7-rv4p. This is the BeginBlocker/EndBlocker panic class — see methodology nuance below.
Add to Section 2 checks: unlike transaction-processing paths, code executed in BeginBlocker / EndBlocker / vote-extension-processor / PreBlocker does NOT recover from panics — any panic halts the chain. Cosmos SDK has 4+ advisories of exactly this pattern (ASA-2025-003, ISA-2025-002, ISA-2025-005, plus adjacent). Check:
panic(), recover() boundary, and unchecked slice index in BeginBlocker / EndBlocker / PreBlocker / vote-extension paths (grep the module for BeginBlock(, EndBlock(, PreBlock()for _, v := range stateSlice where stateSlice could be empty at genesis or after migrationx.(T) without the ok formTag: [PANIC-BLOCK:{func}:{trigger}]
If LSP/SCIP/ast-grep unavailable, degrade to:
range m (Go) or .iter() on HashMap (Rust)time.Now, SystemTimefloat64, f64git log --oneline consensus/)Note degraded scope in the finding with [PRIMITIVE:FALLBACK].
fork-choice-audit, cross-environment-semantic-drift, go-concurrency-safety, rust-unsafe-auditdepth-consensus-invariant, depth-state-tracedocs/l1-mode/severity-matrix.mdLaunch the Plamen deterministic L1 infrastructure audit pipeline
Run the Plamen L1 infrastructure audit wizard in Codex
Launch the Plamen deterministic Web3 security audit pipeline
Run the Plamen smart-contract audit wizard in Codex
L1 trigger - audits BLS signature aggregation: subgroup check, rogue-key attack defense, aggregation order, signing-domain separation.
L1 trigger - audits configuration constants, documented bounds, feature-gated values, and unused protocol limits for semantic drift.