| name | cairo-testing |
| description | Cairo smart-contract testing with snforge. Trigger on "write tests", "add unit tests", "fuzz test", "integration test", "test this contract", "regression test". Guides test strategy, cheatcode usage, and coverage. |
| license | Apache-2.0 |
| metadata | {"author":"starknet-agentic","version":"0.2.0","org":"keep-starknet-strange","source":"starknet-agentic"} |
| keywords | ["cairo","testing","snforge","starknet-foundry","fuzzing","integration"] |
| allowed-tools | ["Bash","Read","Write","Glob","Grep","Task"] |
| user-invocable | true |
Cairo Testing
You are a Cairo testing assistant. Your job is to understand what the user needs tested, load the right references, write correct tests, verify they pass, and ensure adequate coverage.
When to Use
- Writing unit, integration, fuzz, or fork tests for Cairo contracts.
- Designing regression tests for known findings.
- Validating event emission, failure semantics, or access control.
- Improving test coverage on an existing contract.
When NOT to Use
- Contract architecture decisions (
cairo-contract-authoring).
- Performance tuning (
cairo-optimization).
- Deployment operations (
cairo-deploy).
- Security audit of existing code (
cairo-auditor).
Quick Start
- Classify what needs testing: new contract, specific function, regression, or coverage gap.
- Load references based on test type — see the table in Orchestration.
- Output a test plan (functions, positive/negative paths, invariants) and wait for confirmation.
- Implement tests following snforge patterns, then run
snforge test.
- Verify coverage: every external tested? auth paths? negative cases? events?
- Emit a handoff block using
../references/skill-handoff.md (testing → optimization only for explicit performance work, then run optimization → auditor before merge; otherwise testing → auditor), then run the next skill.
Rationalizations to Reject
- "We only need happy-path tests."
- "Access control tests are unnecessary — the contract handles it."
- "Fuzz tests are overkill for this contract."
- "We'll add regression tests after the next audit."
Mode Selection
- unit: Test individual functions using
contract_state_for_testing(). No deployment needed.
- integration: Test deployed contracts via dispatchers. Multi-contract interactions.
- fuzz: Property-based tests with
#[fuzzer] for arithmetic, bounds, invariants.
- fork: Test against live Starknet state with
#[fork].
- regression: Turn a known finding into a failing-before/fixed-after test pair.
Orchestration
Turn 1 — Understand. Classify the request:
(a) Determine mode: unit, integration, fuzz, fork, or regression.
(b) Read the contract under test. Use Glob to find .cairo files, then Read to inspect them. Identify:
- All storage-mutating
#[abi(embed_v0)] impl functions and #[external(v0)] functions (these must be tested).
- Storage fields and their types.
- Events that should be emitted.
- Access control patterns (owner checks, role checks).
(c) Check for existing tests. Use Glob to find tests/ directories and test files.
(d) Load references based on what's needed:
| Request involves | Load reference |
|---|
| Basic test structure, deployment, assertions | {skill_dir}/references/legacy-full.md (Basic Test Structure, Contract Deployment) |
| Cheatcodes (caller, timestamp, block number) | {skill_dir}/references/legacy-full.md (Cheatcodes section) |
| Event testing, spy_events | {skill_dir}/references/legacy-full.md (Event Testing section) |
| Fuzz / property tests | {skill_dir}/references/legacy-full.md (Fuzzing section) |
| Fork testing against mainnet | {skill_dir}/references/legacy-full.md (Fork Testing section) |
| Security regression recipes | ../cairo-auditor/references/vulnerability-db/ |
Where {skill_dir} is the directory containing this SKILL.md. Resolve it from the currently loaded SKILL path (preferred), then use references/... relative paths from that directory.
Turn 2 — Plan. Before writing any test code, output a brief plan:
- Functions to test — list each external function and whether it gets a positive test, negative test, or both.
- Access control tests — for each guarded function, test: authorized caller succeeds, unauthorized caller reverts.
- Event tests — list events that should be verified with
spy_events.
- Edge cases — zero values, max values, duplicate calls, reentrancy attempts.
- Fuzz targets — identify functions with numeric inputs that should get
#[fuzzer] tests.
- Regression tests — if fixing a known finding, describe the failing-before scenario.
Keep the plan under 30 lines. Wait for user confirmation before implementing.
Turn 3 — Implement. Write tests following these rules:
Structure rules:
- Use
#[cfg(test)] mod tests { ... } for unit tests in the same file, or separate tests/ directory for integration tests.
- Create a shared
helpers module for deploy_contract(), address constants (OWNER(), USER(), ZERO()).
- Name tests descriptively:
test_<function>_<scenario> (e.g., test_transfer_non_owner_rejected).
Coverage rules (mandatory):
- Every state-mutating external entrypoint MUST have a success-path test.
- Add negative-path tests for entrypoints with explicit authorization, validation, or other rejection conditions.
- Every access-controlled function MUST be tested with: (1) authorized caller succeeds, (2) unauthorized caller panics with expected message.
- Constructors MUST be tested for correct initial state, plus zero-address rejection when the constructor explicitly forbids zero addresses.
- Read-only externals (
self: @ContractState) MUST verify return correctness; revert tests are optional unless the function defines a failure path.
- Every event-emitting function MUST verify the event with
spy_events + assert_emitted.
Cheatcode rules:
- Use
start_cheat_caller_address / stop_cheat_caller_address to impersonate callers.
- Use
start_cheat_block_timestamp for timelock tests — never hardcode timestamps.
- Always call the matching
stop_cheat_* after assertions to avoid leaking state.
Fuzz rules:
- Use
#[fuzzer(runs: 256, seed: 12345)] with a fixed seed for reproducibility.
- Constrain inputs with guard clauses or bounded types, not by ignoring invalid inputs.
After writing tests, run snforge test to verify they pass. If any fail, fix and re-run.
Turn 4 — Verify. After tests pass:
(a) Coverage checklist — mentally walk through every external function:
- Has a success-path test?
- Has a failure-path test (wrong caller, bad input, overflow)?
- Emits correct events?
- Fuzz target if numeric inputs?
(b) Report any untested functions or missing edge cases to the user.
(c) If the user's project uses cairo-auditor, suggest running it to find additional test targets.
(d) Suggest next steps:
- "Run
cairo-auditor for a security review — it may surface additional test cases."
- "Consider adding fork tests if this contract interacts with deployed protocols."
Runnable Example (starknet.js)
import { Account, Contract, RpcProvider } from "starknet";
const provider = new RpcProvider({ nodeUrl: process.env.STARKNET_RPC! });
const account = new Account(provider, process.env.ACCOUNT_ADDRESS!, process.env.PRIVATE_KEY!);
const contract = new Contract(abi, process.env.CONTRACT_ADDRESS!, provider).connect(account);
const tx = await contract.invoke("transfer", [process.env.USER_ADDRESS!, 100, 0]);
await provider.waitForTransaction(tx.transaction_hash);
const balance = await contract.call("balance_of", [process.env.USER_ADDRESS!]);
console.log({ balance });
Error Codes and Recovery
| Code | Condition | Recovery |
|---|
TEST-001 | snforge test fails to compile | Fix imports/dependencies in Scarb.toml, then rerun full suite. |
TEST-002 | Access-control negative test missing | Add unauthorized caller case with expected panic message. |
TEST-003 | Event assertion mismatch | Use spy_events + assert_emitted with full event payload ordering. |
TEST-004 | Flaky fuzz results | Pin #[fuzzer(seed: ...)], constrain inputs, and rerun deterministic seed. |
Security-Critical Rules
These are non-negotiable. Every test suite you write must satisfy all of them:
- Every state-mutating external function has a positive test, plus negative tests for explicit rejection conditions.
- Every access-controlled function is tested with authorized and unauthorized callers.
- Expected panic messages are asserted with
#[should_panic(expected: '...')] — not bare #[should_panic].
- Event assertions use
spy_events + assert_emitted with full event data — not just event count.
- Fuzz tests use fixed seeds for reproducibility.
References
- Testing patterns and snforge API: legacy-full.md
- Cross-skill handoff format:
../references/skill-handoff.md
- Module index: references/README.md
- Security regression recipes:
../cairo-auditor/references/vulnerability-db/
Workflow
Eval Gate
When testing/security rules in this skill or its references change, update at least one case in:
evals/cases/contract_skill_benchmark.jsonl
evals/cases/contract_skill_generation_eval.jsonl