determinism-guard

Scan codebases for non-deterministic patterns that break reproducibility in games, simulations, ML pipelines, and scientific computing.


Install command

npx skills add https://github.com/oldmangrizzz/REAL_JARVIS --skill determinism-guard

Copy and paste this command into Claude Code to install the skill.

Source

oldmangrizzz/REAL_JARVIS

Stars: 0

Forks: 1

Updated: April 22, 2026 at 12:43


name: determinism-guard
description: Scan codebases for non-deterministic patterns that break reproducibility in games, simulations, ML pipelines, and scientific computing.

Determinism Guard — Static Analysis for Reproducible Code

Quick Start

Ask your agent:

Run a determinism audit on src/

No installation, no configuration, no dependencies. The skill scans your source files using grep-based pattern matching and contextual code reading, then produces a structured report.

Other trigger phrases:

"Scan this project for non-deterministic patterns"
"Check my simulation code for reproducibility issues"
"Find sources of flaky test randomness"
"Audit the game engine for unseeded RNG"

Goal

Non-deterministic code is a silent killer. A single Math.random() call buried in a utility function can corrupt a replay system, break hundreds of snapshot tests, and invalidate weeks of simulation tuning. The bug is invisible until someone asks "why does this save file differ between runs?" — and by then the non-determinism is woven through the codebase.

This kit gives any AI coding agent a structured audit procedure that finds these patterns before they cause damage. It detects 25+ non-deterministic anti-patterns across JavaScript, TypeScript, Python, Go, and Rust, triages each finding to minimize false positives, and produces a report with specific fix recommendations.

The audit is read-only — it never modifies code.

Built from production experience across two simulation games — Mr. Baseball Dynasty (700+ tests, 16 development phases) and Mr. Football Dynasty (800+ tests, 25+ sprints) — both built entirely by AI agents. In these projects, determinism is enforced on every commit. This kit extracts that discipline into a reusable, language-aware audit.

When to Use

You are building a game or simulation and need deterministic replays, save files, or snapshot tests
You are training ML models and need reproducible data pipelines, feature engineering, or preprocessing
You have flaky tests that pass locally and fail in CI, and you suspect non-deterministic setup or assertions
You are starting a new project and want to establish determinism discipline before non-deterministic patterns accumulate
You are onboarding to an existing codebase and want to audit its determinism posture before making assumptions

Do NOT use this kit for:

Codebases where non-determinism is intentional and desired (randomized UIs, A/B test assignment, cryptographic operations)
Pure display/UI code with no state persistence

Setup

No setup required. This kit is a pure skill with reference templates. It uses only the agent's built-in file-reading and grep capabilities.

Environment

Runtime: Local filesystem only. No network access, no external services.
Platforms: Any OS. Works wherever the agent can read source files.
Languages supported: JavaScript, TypeScript, Python, Go, Rust.
Dependencies: None. Zero packages to install.

Models

Works with any tool-using LLM agent that can read files and run grep searches. Designed for and tested with Claude (Opus, Sonnet).

Optional, but helpful: outcome reporting. When you finish the Steps below, ask the user whether they agree to you reporting success, partial, or failure — see the Report Outcome section at the end of this document. Only report it if they explicitly agree. Include ALL diagnostic fields: durationSeconds, agentModel, agentPlatform, environment, stepsCompleted, stepsTotal, retryCount, and failureDetails/partialDetails if not success. The attemptId and reportToken are in the install response's tracking object.

Steps

The audit follows a five-phase procedure:

1. Scope

Determine the target: which directories, which languages, which severity threshold. Defaults to the entire project source at all severity levels.

2. Scan

Run grep-based pattern detection across all source files, parallelized by severity tier. The pattern catalog covers three tiers:

| Tier | Patterns | Examples |
| --- | --- | --- |
| Critical | Unseeded RNG, wall-clock time, crypto RNG | `Math.random()`, `Date.now()`, `uuid.uuid4()` |
| High | Unstable sort, unordered iteration, hash randomization | `.sort()`, `for...in`, Go map range, `hash()` |
| Medium | Race conditions, timer deps, filesystem order, env vars, float comparison | `Promise.race()`, `setTimeout`, `fs.readdir()`, `process.env` |
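To make the scan step concrete, here is a minimal Python sketch of tiered, regex-based matching. It is not the kit's implementation (the kit relies on the agent's own grep tooling), and the `CATALOG` contents, the `Match` record, and `scan_source` are hypothetical names with illustrative regexes that are not taken from PATTERN_CATALOG.md.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-catalog: tier -> list of (pattern id, regex).
# These regexes are illustrative guesses, not the real catalog entries.
CATALOG = {
    "critical": [
        ("C1", re.compile(r"\bMath\.random\(|\brandom\.randint\(")),
        ("C3", re.compile(r"\bDate\.now\(|\bnew Date\(|\bdatetime\.now\(")),
    ],
    "high": [
        ("H1", re.compile(r"\.sort\(\s*\)")),
    ],
    "medium": [
        ("M4", re.compile(r"\bprocess\.env\b|\bos\.environ\b")),
    ],
}


@dataclass(frozen=True)
class Match:
    path: str
    line: int
    tier: str
    pattern_id: str
    text: str


def scan_source(path: str, source: str) -> list[Match]:
    """Return every catalog hit in `source`, one Match per (line, pattern)."""
    matches = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for tier, patterns in CATALOG.items():
            for pattern_id, regex in patterns:
                if regex.search(text):
                    matches.append(Match(path, lineno, tier, pattern_id, text.strip()))
    return matches
```

Each `Match` is only a candidate. It carries the file and line so that the triage step can read the surrounding context before anything is reported.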

3. Triage

For every match, read 5-10 lines of surrounding context. Classify each as:

TRUE POSITIVE — Non-deterministic pattern in a deterministic-critical path, not mitigated. Needs a fix.
GUARDED — Pattern exists but is properly wrapped (seeded RNG, injected clock, sorted collection). Noted, not flagged.
FALSE POSITIVE — Pattern is in a non-critical path (logging, display, one-time setup). Dismissed with reason.

Triage is mandatory. The kit never reports raw grep matches.

4. Report

Generate a structured report using the REPORT.md template:

Summary table with counts by severity and classification
Verdict: PASS, WARN, or FAIL
Quick Wins section for easy fixes
Detailed findings with file, line, code context, and fix recommendation
Deferred section listing guarded and false positive findings for transparency

5. Recommend

For each true positive, provide a specific fix using safe alternatives from the PATTERN_CATALOG.md reference. Fixes are language-specific and use the project's existing conventions where possible.

What Ships

| File | Role | Purpose |
| --- | --- | --- |
| `SKILL.md` | Skill | The audit procedure — scope, scan, triage, report, recommend |
| `templates/PATTERN_CATALOG.md` | Reference | 25+ patterns with severity, detection approach, and safe alternatives |
| `templates/REPORT.md` | Template | Structured output format for audit findings |
| `templates/THREAT_MODEL.md` | Security | Kit-specific threat model covering four attack surfaces |

Pattern Coverage

Critical Severity (always breaks determinism)

C1: Unseeded RNG — Math.random(), random.randint(), rand.Intn(), rand::random(). Safe alternative: seeded PRNG instance.
C2: Crypto RNG in logic — crypto.randomUUID(), uuid.uuid4(), os.urandom(). Safe alternative: seeded PRNG for logic, crypto only for security.
C3: Wall-clock time — Date.now(), new Date(), datetime.now(), time.Now(). Safe alternative: injected clock, tick counter, or deterministic ID generator.
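The safe alternatives for C1 and C3 can be illustrated with a short Python sketch: a seeded `random.Random` instance passed in explicitly, and a tick counter standing in for wall-clock time. The `Dice`, `TickClock`, and `simulate` names are hypothetical and chosen only for this example; the catalog's own recommendations may differ in detail.

```python
import random
from typing import Callable


class Dice:
    """Rolls dice from an injected, seeded RNG instead of the global one."""

    def __init__(self, rng: random.Random) -> None:
        self._rng = rng

    def roll(self, sides: int = 6) -> int:
        return self._rng.randint(1, sides)


class TickClock:
    """Deterministic stand-in for wall-clock time: a counter that advances per call."""

    def __init__(self, start: int = 0) -> None:
        self._tick = start

    def now(self) -> int:
        self._tick += 1
        return self._tick


def simulate(seed: int, rolls: int, clock: Callable[[], int]) -> list[tuple[int, int]]:
    """Produce (timestamp, roll) events that depend only on the seed and the clock."""
    dice = Dice(random.Random(seed))
    return [(clock(), dice.roll()) for _ in range(rolls)]
```

Calling `simulate(42, 5, TickClock().now)` twice yields identical event lists, which is the property replays and snapshot tests rely on.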

High Severity (breaks determinism in many contexts)

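H1: Unstable sort — .sort() without comparator or tiebreaker. Even a stable sort only preserves the order in which ties arrive, so equal keys need an explicit tiebreaker whenever input order can vary.
H2: Unordered iteration — for...in, Set/Map iteration, Go map range, HashMap/HashSet. JavaScript Set and Map iterate in insertion order, so they matter only when insertion order itself varies; Go map range and Rust HashMap/HashSet order can differ between runs.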
H3: Python hash randomization — hash() on strings, randomized per process.
H4: for...in enumeration — Prototype chain traversal with unstable order.
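As a rough Python illustration of fixes for the high-severity patterns above, the sketch below adds a tiebreaker to a sort, replaces the built-in `hash()` with a content hash, and iterates a set in sorted order. `Team`, `standings`, `stable_bucket`, and `ordered_ids` are hypothetical names for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    team_id: str
    wins: int


def standings(teams: list[Team]) -> list[Team]:
    """Sort by wins descending, then team_id ascending, so ties never depend on input order."""
    return sorted(teams, key=lambda t: (-t.wins, t.team_id))


def stable_bucket(key: str, buckets: int) -> int:
    """Map a string to a bucket with SHA-256 rather than the per-process randomized hash()."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def ordered_ids(ids: set[str]) -> list[str]:
    """Iterate a set in sorted order instead of relying on its internal order."""
    return sorted(ids)
```

The tuple key in `standings` is the Python counterpart of a comparator with a tiebreaker: two teams with equal wins are always ordered by `team_id`, whatever order they arrived in.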

Medium Severity (context-dependent)

M1: Promise.race/any — Non-deterministic winner driving state.
M2: Timer-dependent logic — setTimeout/setInterval mutating state.
M3: Directory listing order — fs.readdir/os.listdir unsorted.
M4: Environment dependencies — process.env/os.environ in logic.
M5: Float comparison — Direct equality on computed floats.
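Two of the medium-severity items, M3 and M5, have small, common mitigations that can be sketched in Python. The helper names `list_sorted` and `nearly_equal` and the default tolerances are assumptions for illustration; appropriate tolerances depend on the computation being compared.

```python
import math
import os


def list_sorted(directory: str) -> list[str]:
    """Return directory entries in sorted order; os.listdir returns them in arbitrary order."""
    return sorted(os.listdir(directory))


def nearly_equal(a: float, b: float, rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Compare computed floats with a tolerance instead of direct equality."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
```

For instance, `0.1 + 0.2 == 0.3` is false in IEEE 754 double arithmetic, while `nearly_equal(0.1 + 0.2, 0.3)` is true.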

See templates/PATTERN_CATALOG.md for full detection details and safe alternatives for each pattern.

Constraints

Read-only. The audit never modifies code. It reports findings only.
Precision over recall. Every match is triaged in context. False positives are dismissed with reasons, not silently dropped.
Language-aware. The audit accounts for version-dependent behavior (dict insertion ordering guaranteed from Python 3.7, V8 sort stability, Go map iteration randomization).
Source-only. Third-party library internals (node_modules, site-packages) are not scanned. Transitive non-determinism through your own utilities is traced one level deep.
Scope-respecting. The audit only scans paths the user specifies. No surprise expansion into unrelated directories.

Limitations

Static analysis detects known patterns. It cannot prove code IS deterministic.
The pattern catalog covers five languages. Other languages need custom patterns.
Third-party library internals are out of scope.
Version-dependent behavior is documented where known but may not cover every runtime edge case.

Safety Notes

This kit performs pure static analysis with no side effects:

No file writes. The audit produces a report as conversation output. It never creates, modifies, or deletes files in the project.
No network access. All analysis is local. No data leaves the machine.
No code execution. The audit reads source files and runs grep searches. It does not execute, compile, or interpret the code being audited.
No external dependencies. Zero packages, services, or APIs required.

Shared-Environment Considerations

The audit reads source files that may contain sensitive code. The report includes code snippets from findings. If the audit output is shared (e.g., pasted into a PR or issue), review the snippets for sensitive content first.
Fix recommendations are advisory. They should be reviewed in context before applying. The threat model (templates/THREAT_MODEL.md) documents the risk of mechanical fix application and the mitigation (read-only constraint plus contextual reasoning in each recommendation).

Worked Example — simulation game audit

A game simulation uses a simulateSeason() function. An agent runs the audit on src/simulation/:

Scope: src/simulation/, JS/TS, all severities.
Scan finds 7 matches:
- Math.random() in src/simulation/utils/dice.ts:14 — C1 Critical
- new Date() in src/simulation/engine.ts:203 — C3 Critical
- .sort() in src/simulation/standings.ts:47 — H1 High
- Math.random() in src/simulation/ui/confetti.ts:8 — C1 Critical candidate
- Date.now() in src/simulation/logger.ts:12 — C3 Critical candidate
- process.env.DEBUG in src/simulation/engine.ts:1 — M4 Medium candidate
- setTimeout in src/simulation/ui/animation.ts:22 — M2 Medium candidate
Triage:
- dice.ts:14 — TRUE POSITIVE. Called from game logic. No seeded wrapper.
- engine.ts:203 — TRUE POSITIVE. new Date() used for game event timestamps stored in save file.
- standings.ts:47 — TRUE POSITIVE. .sort() on team records without tiebreaker. Equal-record teams can reorder between runs.
- confetti.ts:8 — FALSE POSITIVE. UI-only visual effect, does not affect state.
- logger.ts:12 — FALSE POSITIVE. Pure logging, does not affect state.
- engine.ts:1 — GUARDED. Only controls debug logging verbosity, not sim behavior.
- animation.ts:22 — FALSE POSITIVE. UI animation timing, does not mutate game state.
Verdict: FAIL. Three critical/high true positives in simulation-critical paths.
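Quick wins: standings.ts:47 — replace the bare .sort() with .sort((a, b) => b.wins - a.wins || a.teamId.localeCompare(b.teamId)). Takes 2 minutes. Because localeCompare without an explicit locale follows the runtime's default locale, pass a fixed locale or compare code points if results must match across machines.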

The report lists all 7 findings. 3 true positives get fix recommendations. 4 dismissed findings go in the Deferred section with reasons. The developer fixes the easy sort first, then plans the RNG and timestamp refactors.
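The tallies in this example can be reproduced with a short Python sketch that encodes the seven triaged findings and counts them by severity and classification. The `summarize` and `verdict` helpers are hypothetical, and the verdict rule in particular is an assumption: the source states only that three critical/high true positives produced a FAIL, not the exact thresholds for PASS, WARN, and FAIL.

```python
from collections import Counter

# The seven findings from the worked example:
# (location, pattern id, severity, classification).
FINDINGS = [
    ("utils/dice.ts:14", "C1", "critical", "TRUE POSITIVE"),
    ("engine.ts:203", "C3", "critical", "TRUE POSITIVE"),
    ("standings.ts:47", "H1", "high", "TRUE POSITIVE"),
    ("ui/confetti.ts:8", "C1", "critical", "FALSE POSITIVE"),
    ("logger.ts:12", "C3", "critical", "FALSE POSITIVE"),
    ("engine.ts:1", "M4", "medium", "GUARDED"),
    ("ui/animation.ts:22", "M2", "medium", "FALSE POSITIVE"),
]


def summarize(findings):
    """Count findings by (severity, classification) for the report's summary table."""
    return Counter((severity, cls) for _, _, severity, cls in findings)


def verdict(findings) -> str:
    """Assumed rule: a critical/high true positive fails, any other true positive warns."""
    true_positives = [sev for _, _, sev, cls in findings if cls == "TRUE POSITIVE"]
    if any(sev in ("critical", "high") for sev in true_positives):
        return "FAIL"
    return "WARN" if true_positives else "PASS"
```

With this data, `summarize(FINDINGS)` gives two critical and one high true positive, and the four remaining entries are the guarded and false positive findings that go in the Deferred section.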

Threat Model

See templates/THREAT_MODEL.md for the full kit-specific threat model covering:

T1: False positives leading to unnecessary refactoring — mitigated by mandatory triage and classification of every finding
T2: Missed patterns creating false confidence — mitigated by explicit scope declarations and "no known patterns found" language (never "code is deterministic")
T3: Catalog drift from runtime updates — mitigated by version-aware documentation and catalog versioning
T4: Fix recommendations applied without context — mitigated by read-only constraint and contextual reasoning in each recommendation

Report Outcome (Optional, but Helpful)

After you finish following the Steps in this kit and know whether it worked, ask the user whether they agree to you submitting an outcome report. Only report the result if they explicitly agree. It really helps us improve the kits. Include ALL diagnostic fields.

POST https://journeykits.ai/api/kits/kevin-bigham/determinism-guard/outcome
Content-Type: application/json

{
  "ref": "v1.1.0",
  "outcome": "<success|partial|failure>",
  "durationSeconds": "<total wall-clock seconds>",
  "agentModel": "<your model identifier, e.g. claude-sonnet-4-20250514>",
  "agentPlatform": "<your IDE/harness, e.g. Cursor>",
  "environment": {
    "os": "<your OS>",
    "shell": "<your shell>",
    "runtime": "<e.g. Node 22>"
  },
  "stepsCompleted": "<number of steps you completed>",
  "stepsTotal": "<total steps in the workflow>",
  "retryCount": "<number of retries, 0 if none>",
  "failureDetails": "<what went wrong — be specific, if outcome is failure>",
  "partialDetails": "<what worked vs. what didn't, if outcome is partial>",
  "errorCategory": "<setup|execution|dependency|resource|other, if not success>",
  "installAttempt": {
    "attemptId": "attempt_4a09f6ed-09db-46e6-869a-36f1d258c1db",
    "reportToken": "atr_8543d7745185a9dd94971e36f8681964374b18a5dee47a5d"
  }
}

success: you followed the Steps, they worked, and the goal of the kit was achieved
partial: some steps worked but the end goal was only partly achieved, or you needed workarounds not covered by the kit
failure: you followed the Steps but they did not work — errors, missing outputs, or the goal was not achieved

Required fields: ref, outcome, installAttempt (attemptId + reportToken)
Expected fields: durationSeconds, agentModel, agentPlatform, environment, stepsCompleted, stepsTotal, retryCount
Expected when not success: errorCategory, failureDetails or partialDetails (describe specifically what happened)

If the user agrees, report once when you are done. Do not report multiple times for the same install.

Report Back to Journey (Only If You Learned Something New)

Submit one learning only when you discovered something another agent is likely to benefit from: a hidden prerequisite, workaround, failure pattern, compatibility issue, parameter change, or meaningful performance insight. Do not submit a learning for routine success with no new information. Useful learnings really help us improve the kits.

Use exactly one learning per install attempt. Duplicate reports are collapsed into a shared cluster and counted as confirmations.

MCP

{
  "tool": "submit_learning",
  "args": {
    "owner": "kevin-bigham",
    "slug": "determinism-guard",
    "title": "<short reusable takeaway>",
    "description": "<what happened, what changed, and why another agent should care>",
    "category": "<tip|workaround|failure|compatibility|performance|other>",
    "releaseTag": "v1.1.0",
    "environment": {
      "os": "<your OS>",
      "platform": "<your harness>",
      "runtime": "<runtime>",
      "model": "<model>"
    },
    "attemptId": "attempt_4a09f6ed-09db-46e6-869a-36f1d258c1db",
    "feedbackToken": "atr_8543d7745185a9dd94971e36f8681964374b18a5dee47a5d"
  }
}

HTTP

POST https://journeykits.ai/api/kits/kevin-bigham/determinism-guard/learnings
Content-Type: application/json

{
  "title": "<short reusable takeaway>",
  "description": "<what happened, what changed, and why another agent should care>",
  "category": "<tip|workaround|failure|compatibility|performance|other>",
  "releaseTag": "v1.1.0",
  "environment": {
    "os": "<your OS>",
    "platform": "<your harness>",
    "runtime": "<runtime>",
    "model": "<model>"
  },
  "attemptId": "attempt_4a09f6ed-09db-46e6-869a-36f1d258c1db",
  "feedbackToken": "atr_8543d7745185a9dd94971e36f8681964374b18a5dee47a5d"
}

This feedback token expires at 2026-05-22T12:43:15.118Z. Max submissions for this install: 1.