| name | implement-ticket |
| description | Read, reassess, and implement a repository ticket when the user asks to implement a ticket or provides a ticket path. Use for ticket-driven work that must validate the ticket against the current codebase, surface discrepancies before coding, then implement and verify the full deliverable set. |
Implement Ticket
Use this skill when the user asks to implement a ticket, gives a ticket file path, or clearly wants ticket-driven execution. Covers both code-changing tickets and non-code deliverables (measured decisions, archival updates, series-status corrections).
Required Inputs
- A ticket path, glob, or enough context to locate the ticket
- Any extra constraints from the user
Working Notes
Load references/working-notes.md for the working-notes checklist, commentary usage, and the 1-3-1 boundary reset ledger format. Emit the compact working-notes checkpoint before coding. If resuming after context compaction, interruption, or a long handoff and the visible context does not include that checkpoint, reload the working-notes reference and reconstruct or re-emit the checkpoint before any further file edit. If the handoff only says a checkpoint happened but does not preserve the full ticket-named deliverables ledger, treat the checkpoint as incomplete: rebuild that ledger from the active ticket, final diff, and untracked files before any further edit or terminal closeout. If the handoff names an in-flight command, terminal session id, background pid, or still-running proof lane, poll or classify that command before starting any command that can contend for the same package, cache, generated tree, dist, schema artifacts, goldens, benchmark resources, or other proof outputs.
Minimal checkpoint shape:
- `draft/untracked status`: active ticket/spec/sibling state that matters
- `discrepancy class`: blocking vs nonblocking live mismatches
- `authoritative boundary`: what this ticket now owns
- `proof noun alignment`: the ticket's claimed invariant noun, the required observable fields/values, and whether the proposed witness proves behavior rather than only invocation, plumbing, or reachability
- `expected generated fallout`: schema artifacts, goldens, compiled JSON, or none
- `verification substitutions`: any ticket-command or proof-shape corrections
- `reference guidance loaded`: triggered reference files loaded for this ticket, or `not loaded + why` when a normally triggered reference is intentionally skipped
- `acceptance-proof lanes`: final lanes you intend to cite
- `ticket graph/status integrity lane`: if terminal status, dependency edges, active/archive classification, sibling ownership, or successor/follow-up files may change, plan the repo's narrow ticket-dependency or markdown-integrity check now; use `not applicable` only when no graph/status edge changes or no such checker exists
- `output contention / sequencing`: classify planned proof lanes that consume or rewrite `dist`, schemas, goldens, compiled JSON, or other generated trees as `parallel-safe` or `serial-only`; name any build/regeneration prerequisite and rerun needed if a later lane cleans the consumed output
- `terminal status plan`: when the ticket status may become terminal; use the repo-local terminal wording already used by the ticket/series (for example `IMPLEMENTED`, `COMPLETED`, or an explicit exception status), and keep that terminal status pending until final lanes are green, classified, or explicitly substituted
- `ticket-named deliverables ledger`: for tracked or active draft tickets with explicit `What to Change`, `Files to Touch`, artifacts, or named witness files, classify each concrete item before coding as `planned`, `already satisfied / verified-no-edit`, `needs rewrite`, `blocked`, or `needs 1-3-1`
  - For ticket-named logs, reports, traces, `results.tsv`, `lessons.jsonl`, campaign outputs, generated artifacts, or other durable evidence files, classify both the semantic deliverable and the delivery path before coding: checked-in artifact, ignored ephemeral artifact plus ticket/report transcription, or ignore-rule change. Use `git ls-files -- <path>` and `git check-ignore -v -- <path>` or the nearest concrete parent/pattern when the file does not exist yet.
  - If a ticket explicitly requires a checked-in artifact but the live ignore rules hide it, treat that as deliverable drift, not a late hygiene issue. Stop for `1-3-1` unless the ticket already authorizes the alternative delivery path or the fix is a narrow ignore-rule correction that preserves the explicit deliverable.
- `intra-ticket contradiction ledger`: when `Acceptance Criteria`, `Out of Scope`, `What to Change`, `Files to Touch`, dependency text, or sibling-owner prose disagree, list each conflict, choose the current precedence, and classify it as `ticket correction planned`, `resolved by explicit sibling owner`, or `needs 1-3-1` before coding
- `single-use migration-script ledger`: when a ticket names a one-shot migration/helper script, classify it as `retained`, `run then deleted`, `unnecessary after live inventory`, or `needs 1-3-1`; record the durable evidence location if the script is not retained
- `commit-body / durable evidence deliverables`: commit-body evidence, seed rationale, failure output, re-bless lines, or other ledgers the ticket requires; for no-commit sessions, plan the checked-in ticket/report/final-closeout location that will carry the evidence, or stop for `1-3-1` if the commit body itself is semantically required
- `red-gate materiality ledger`: for benchmark/measured-gate tickets, record `baseline`, `decisive final`, `target`, `delta`, `percent change`, `verdict`, and `terminal status allowed?`; use `not applicable` for non-measured tickets
- `diagnostic metric gates`: for architecture, migration, proof, or non-benchmark tickets that embed a numeric/percentage canary, classify each metric as `terminal acceptance`, `diagnostic evidence`, or `successor input` before coding; if the classification changes an explicit deliverable or terminal gate, stop for `1-3-1` unless already authorized
- `authorization ledger`: when a user approves a `1-3-1` option or other boundary reset, record the option label, confirmation, and durable repo location where that approval is reflected
- `semantic corrections`: any stale draft expectation, example, or output-shape claim proven wrong by live evidence
- `deferred sibling/spec scope`: broader spec or series work explicitly confirmed out of scope, when relevant; when naming a sibling as owner, record whether that sibling was opened and confirmed, or why the active spec is sufficient
- `source file size risk`: optional but expected when a named source file is already near or over repo guidance and the ticket will add logic there, or when the ticket creates a substantial new source file likely to carry most of the implementation; record `extract-now`, `defer-with-rationale`, or `stop-for-1-3-1` before the risk becomes a final-proof surprise
  - In repos with an 800-line cap, treat an existing named source file at `>=600` lines, or any source file expected to gain `>=100` lines, as near-cap for this checkpoint. Run a cheap size check before coding when practical and record whether the implementation will split now, stay local with a deferral rationale, or stop for `1-3-1`.
  - For profiling or investigation tickets, repeat this check when live evidence selects an unlisted source file as the implementation target. If the file may be near or over repo guidance, run a cheap size check before or immediately after choosing that file and record whether extraction is in scope.
  - For new source files that grow past the repo's typical size band or are plausibly headed there, run a cheap size check before the first final-proof lane and decide whether to split now, defer with rationale, or stop for `1-3-1`.
  - If the session retains active growth in a near-cap or over-guidance source file, plan a compact before/after closeout ledger before final proof: `path | before lines | after lines | crossed cap? | active growth | extraction/defer rationale | successor if any`. This applies both to files that were already over guidance and to files that cross the cap because of the current implementation.
- `runtime surface breadth`: for performance, profiling, diagnostic, or cleanup tickets that touch shared code, classify the changed behavior as `ticket-specific`, `policy/agent-only`, `script/profile-only`, or `shared engine/kernel`. Name any non-agent paths that can reach the changed code when that matters to closeout or user expectations.
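
The near-cap rule and the before/after closeout ledger from the source file size risk item can be sketched in TypeScript as follows. This is a minimal illustration, not repo tooling: the thresholds (800, 600, 100) and the ledger columns come from the checkpoint text, while the function names, the row shape, the example path, and the reading of "crossed cap" as "was at or under 800 lines before and is over 800 after" are assumptions made for the example.

```typescript
// Thresholds quoted from the checkpoint text for repos with an 800-line cap.
const LINE_CAP = 800;
const NEAR_CAP_LINES = 600;
const NEAR_CAP_GROWTH = 100;

// Hypothetical row shape mirroring the closeout ledger columns.
interface SizeLedgerRow {
  path: string;
  beforeLines: number;
  afterLines: number;
  crossedCap: boolean;
  activeGrowth: number;
  rationale: string;
  successor: string | null;
}

// Near-cap for the checkpoint: already at >=600 lines, or expected to gain >=100 lines.
function isNearCap(currentLines: number, expectedGain: number): boolean {
  return currentLines >= NEAR_CAP_LINES || expectedGain >= NEAR_CAP_GROWTH;
}

// Build one ledger row from before/after line counts.
function buildLedgerRow(
  path: string,
  beforeLines: number,
  afterLines: number,
  rationale: string,
  successor: string | null = null,
): SizeLedgerRow {
  return {
    path,
    beforeLines,
    afterLines,
    // Assumption: "crossed" means within the cap before and over it after.
    crossedCap: beforeLines <= LINE_CAP && afterLines > LINE_CAP,
    activeGrowth: afterLines - beforeLines,
    rationale,
    successor,
  };
}

// Render the row in the pipe-separated order used by the ledger.
function formatLedgerRow(row: SizeLedgerRow): string {
  return [
    row.path,
    row.beforeLines,
    row.afterLines,
    row.crossedCap ? "yes" : "no",
    row.activeGrowth,
    row.rationale,
    row.successor ?? "none",
  ].join(" | ");
}

// Usage with a hypothetical file that grows from 780 to 830 lines.
const exampleRow = buildLedgerRow("src/example.ts", 780, 830, "defer: extraction tracked separately");
console.log(formatLedgerRow(exampleRow));
// prints "src/example.ts | 780 | 830 | yes | 50 | defer: extraction tracked separately | none"
```

The line counts themselves would come from whatever cheap size check the repo already uses; the sketch only covers the classification and the ledger row.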
High-Signal Reminders
These are the reminders whose canonical guidance lives nowhere else in this skill. Rules about final-proof gating, touched-file scope sweeps, acceptance-command reconciliation, representative-corpus preflight, post-proof-edit invalidation, sibling handoffs, and already-landed slices are covered in their respective reference files.
- If the user-provided ticket path does not resolve, do a quick normalized-id/stem search across active tickets before assuming the request is blocked. Proceed only when the replacement ticket is unambiguous, and record the correction in working notes.
- Resolve any user-provided spec, doc, or ticket globs such as `specs/157*` to exact live files before reassessment. Record the exact resolved files in working notes so later proof and closeout do not depend on an ambiguous shorthand.
- If live reassessment changes an explicit ticket deliverable rather than only clarifying proof shape (for example the required artifact, the default reproduction command, the owned witness shape, or another user-facing contract the ticket explicitly promised), do not silently rewrite that boundary just because the draft is wrong. In this repo, `AGENTS.md` Ticket Fidelity still applies: stop for `1-3-1` unless the user has already authorized that class of deliverable correction.
- Treat explicit `Files to Touch`, `What to Change`, artifact lists, and named witness files as deliverable candidates during reassessment. If a named file proves unnecessary because the live owner moved or the contract is satisfied elsewhere, record it as `already satisfied / verified-no-edit` or `needs rewrite` in working notes before coding. Use `1-3-1` when leaving it untouched would change a user-facing deliverable rather than merely correcting stale path ownership.
- If one ticket contradicts itself, do not silently pick the convenient clause. When `Acceptance Criteria` promises a deliverable but `Out of Scope`, sibling text, or `Files to Touch` implies deferral, treat the stricter explicit deliverable as authoritative until the ticket is corrected or the user confirms a `1-3-1` reset. An explicit sibling owner can preserve the deferral only after you record that owner and correct the active ticket before coding or final proof.
- When Foundation 14 no-alias cleanup removes or renames an exported/internal identity, classify every required consumer edit before coding as `identity/type-only fallout` or `behavioral ownership`. Type-only fallout needed to make the retired identity disappear is usually part of the current atomic migration even when the draft `Files to Touch` omitted that consumer; behavioral routing, dispatch, or semantic adoption can still remain with the named sibling. Update the active ticket plus any sibling/spec wording before final proof so the series does not confuse "old identity removed" with "future behavior adopted."
- Use a concrete deliverable-correction detector before editing active acceptance text: if the durable witness changes nouns such as `score -> value`, `whole corpus -> supported subset`, `exact seed -> surrogate`, `live simulator -> synthetic fixture`, or `regenerate artifact -> validate unchanged`, treat that as an explicit deliverable correction and stop for `1-3-1` unless already authorized. Do not hide that change as a mere proof-shape clarification.
- Classify stale public-output examples before coding. If an acceptance example names the wrong serialized representation but live precedent proves the canonical public contract and the owned invariant is unchanged, record a `public-contract representation correction`, update the active ticket before final proof, and rerun the focused witness that proves the canonical shape. Example: hidden value sentinel in `resolvedRefs` -> omitted hidden concrete value plus hidden outcome/breakdown. If no established live precedent proves the replacement shape, or the correction changes the owned invariant, witness artifact, behavior, or user-facing deliverable noun, stop for `1-3-1` instead of treating it as a semantic clarification.
- For shared in-memory contract migrations from scalar/ready-only values to explicit status wrappers, build the wrapper migration map before broad proof: wrapper type, producers, all consumers, ready-only numeric aggregation points, unavailable/non-ready branches, and raw-value assertions in tests. Preserve the ready path explicitly instead of letting `ready` become an accidental special case, and add or update focused tests for unavailable statuses before using package/workspace proof as the first assertion sweep.
- Treat test-layout and fixture-layout substitutions the same way when the ticket makes them explicit. Changing `new test file -> existing test file`, `checked-in fixture -> in-memory fixture`, or `production corpus artifact -> synthetic/public-seam fixture` is a witness-artifact correction unless the ticket already allows that alternative. Stop for `1-3-1` or record prior authorization before relying on the substituted witness.
- For a ticket-named single-use migration or rewrite script, do not let `not retained` blur into `not delivered`. Before coding, classify the script as `retained`, `run then deleted`, `unnecessary after live inventory`, or `needs 1-3-1`. If live inventory proves no script is necessary, update the active ticket before final proof with the inventory result, the direct migration method, and the durable evidence location that replaces the script artifact.
- For mixed policy/VM/FFI tickets, classify `value`, `score`, `row`, `candidate`, and `batch` nouns separately before closeout. A ticket that explicitly promises score-producing candidate rows cannot be closed on value-only parity unless the active ticket has first been split, rewritten, or user-authorized; a ticket that already defines a supported subset plus a fail-closed successor may close only after the residual score/candidate owner is explicit.
- When changing the default from an interpreter, closure tree, fallback runtime, or direct evaluator to a narrower VM/bytecode/native route, build a support matrix before deleting fallback calls: expression families and value types, context shapes, production entrypoints, direct-evaluator-required cases, fail-closed cases, and tests for each routed class. Make the distinction durable between `retired fallback path` and `still-required direct evaluation for unsupported semantics`; do not accidentally treat a successful default flip as proof that every non-VM evaluator can be removed.
- For score-row parity, choose the oracle that evaluates the exact row-batch contract. Do not compare against a higher-level policy selection, preview-gating, or orchestration path unless the ticket explicitly owns production routing; those paths can reorder, gate, or reclassify candidates outside the score-row seam.
- For Spec 162 / Foundation 20 preview-signal tickets, keep the proof families separate: engine preview-signal integrity belongs in focused engine tests, while profile convergence or policy-quality witnesses belong under `packages/engine/test/policy-profile-quality/` and use the live lane marker convention (for example `@profile-variant`) rather than invented markers. Treat silent coercion of unavailable preview refs into numeric contributions as a potential engine integrity bug when live RED evidence proves it, but do not move profile-quality claims into determinism just to make them blocking. Preserve explicit fallback/no-contribution provenance in the trace witness and classify advisory policy-quality lane failures separately from engine invariant failures.
- A stale verification command is not automatically the same as a changed deliverable. If the owned artifact and witness stay the same, but the ticket's literal command is repo-invalid or points at the wrong proof lane, correct the command in the active ticket before final proof instead of forcing `1-3-1`. Reserve `1-3-1` for cases where the owned deliverable itself changes.
- If the user interrupts implementation with a request to reassess options against `docs/FOUNDATIONS.md`, treat it as a live boundary-reset step, not as a side comment. Pause file edits, present a repo `1-3-1` with the concrete problem, three options, and one recommendation, then wait for confirmation. After approval, record the selected option in the authorization ledger, patch the active ticket/spec/sibling wording when the approved option changes a deliverable, witness path, proof lane, or ownership boundary, emit a fresh working-notes checkpoint for the approved slice, and only then resume coding. When the approval keeps the active ticket closeable but narrows or defers part of the draft, use `references/boundary-reset-recovery.md`'s confirmed narrowed/deferred re-entry path.
- Use this compact sequence after approval: `record authorization -> patch the active artifact(s) -> re-emit the working-notes checkpoint -> re-extract acceptance/proof lanes from the corrected artifact -> resume implementation`. Do not let the approved reset live only in conversation when it changes the durable boundary or proof story.
- When acceptance includes future or remote CI observation that cannot run in the local session, do not silently drop it or overclaim it. Prove the same workflow/artifact boundary locally where possible, record the CI observation as external/pending/not-locally-run in the active ticket before final proof, and use the repo-local terminal status only when the owned implementation is complete and the remaining CI observation is evidence external to the implementation. If CI itself is the owned deliverable and no local surrogate can prove it, keep the ticket nonterminal or stop for `1-3-1`.
- If a ticket explicitly requires evidence, seed rationale, failure output, or another ledger in the implementing commit body, but the user has not asked you to commit, do not let that deliverable disappear. If the commit body is semantically required, stop for `1-3-1`; otherwise rewrite the active ticket before final proof so the ticket outcome, checked-in report, or final closeout becomes the durable evidence ledger for the no-commit session.
- When a ticket asks for a diagnostic game-data artifact under `data/games/...` but production profile or production rules changes are out of scope, keep the artifact explicitly diagnostic. It may live beside production data only when it is not imported by the production entrypoint or is otherwise non-authoritative/test-loaded. Record in the ticket outcome whether the artifact is production-imported, test-overlaid, or read-only evidence so Foundations #1/#2 and Ticket Fidelity both remain clear.
- For golden-fixture tickets, include a compact provenance ledger before closeout: fixture path, seed/source state, replay prefix or decision count when relevant, normalized excerpt shape, byte-identity oracle, and the re-bless diagnostic text or commit-body substitute. If any of those replaces a ticket-named witness shape, treat it as a witness-deliverable correction and use `1-3-1` unless already authorized.
- When a ticket names wildcard acceptance checks or grep-based emptiness proofs, validate early whether the literal pattern matches the true owned boundary. If the pattern overreaches into intentional derived surfaces outside the mutable slice, narrow the proof to the truthful owned invariant and record that correction in the active ticket before final closeout.
- When an acceptance command is a grep-based emptiness proof (`returns zero hits`, `returns empty`, or equivalent), remember that `rg` exits with status `1` when it finds no matches. Treat that combination as a passing proof result when the command otherwise ran cleanly; reserve failure classification for actual matches, stderr/tooling errors, or an over-broad pattern that still needs boundary correction.
- When a ticket owns a grep/scanner over generated or bundled artifacts, prove the scanner against the live emitted signal instead of only the source-level violation noun. Generated output can transform a forbidden input into a sentinel stub, loader artifact, sourcemap marker, or runtime helper name, while minified code can create unrelated substring false positives. Use a deliberate negative edit or fixture to inspect the actual generated artifact, classify literal false positives and transformed sentinel markers, and update the active ticket before final proof when the live generated signal differs from the draft grep target.
- When a ticket owns regenerated fixtures or other generated artifacts and the repo provides a nearby helper script, validate that helper against the current live runtime/API seam before relying on it as the authoritative regen path. If the helper partially rewrites owned artifacts and then fails, treat the entire owned artifact set as dirty, regenerate it again through a known-live seam, and only then continue toward final proof.
- Before regenerating every artifact named by a stale ticket, inspect whether each artifact actually serializes the changed contract. Regenerate contract-bearing artifacts, validate unchanged artifacts through the relevant proof lane, and stop for `1-3-1` before changing an explicit ticket deliverable from "regenerate" to "validate unchanged."
- When migrating an internal helper shape such as `runtime -> runtime` to `state -> state`, search for all direct writers of the same runtime or structural fields, not only existing helper call sites. Reconcile equivalent semantic insertion paths before final proof so another writer cannot preserve the stale snapshot behavior or bypass the new state-derived helper contract.
- When replacing a custom loop, helper, or probe with a richer canonical primitive, do a default-behavior parity check before first proof. Compare the old path's incidental work against the new primitive's defaults: trace retention, delta/snapshot computation, hook emission, profiler behavior, runtime forking, kernel options, and any retained logs or diagnostics. Preserve the old lightweight behavior explicitly unless the ticket owns the broader output or work.
- If the richer primitive forks or encapsulates runtime state that the old path sampled directly, verify whether each ticket-owned diagnostic field still observes the same authoritative runtime. If a metric becomes hidden, stale, or sampled from the wrong owner, stop before closeout: either expose/prove the equivalent metric through the canonical seam, rewrite the ticket with the semantic correction, or use `1-3-1` when that changes an explicit diagnostic deliverable.
- When optimizing a `WeakMap`, memo table, draft cache, or other mutable lifetime surface, decide the aliasing model before coding: immutable snapshot, private mutable view, copy-on-write alias, or run-local discard. Add a focused aliasing regression proving that any state/cache visible after the private scope cannot be corrupted by later draft mutation, and include the cache-copy/counter evidence in the ticket outcome when the ticket owns lifetime cost.
- Distinguish pure derived-value `WeakMap` memoization from mutable cache lifetime work. If the cache is keyed by immutable object identity and stores an immutable deterministic value, prove value/hash/determinism parity plus object-lifetime scoping; require aliasing regression only when cached values are mutable, share descendants, influence later mutation, or can leak outside the private scope.
- For persistent caches, memoized accelerators, or any optimization where correctness can still pass through a fallback path, include an activation witness for the accelerated route before closeout. Use the narrowest durable proof available: a hit/miss counter, trace/log assertion, subprocess cache-hit probe, or focused test that would fail if the cache is written but never read. Do not treat a green correctness suite as proof of cache adoption unless that suite also proves the accelerated route was exercised.
- For persistent-cache tests that must prove both byte-equivalence and read activation, keep those proof shapes separate. Use normal miss/write/read flow for equivalence against the uncached oracle; use an isolated activation test with a deliberately distinct valid sentinel entry, cache-hit counter, or equivalent witness so fallback compilation cannot satisfy the assertion. Do not let the sentinel path contaminate the byte-identity proof.
- For persistent-cache optimizations that change cache entry contents, storage location, warm/upload behavior, or read/write path cost, include a compact cache tradeoff ledger before closeout: `old entry size`, `new entry size`, `read/parse or hit time`, `write/warm time`, `artifact transfer relevance`, and `retained/rejected verdict`. A smaller CPU metric is not enough by itself when the change materially increases disk I/O, generated artifact size, CI artifact payload, or warm-step cost.
- When extracting or exporting a deterministic primitive helper such as a hash continuation, canonical serializer, id normalizer, or pure digest helper, decide whether proof belongs at the helper seam or the nearest public deterministic seam. Existing oracle tests are sufficient only when they assert exact canonical output through the helper's real public consumer after the helper change; otherwise add a direct helper-level parity test that proves `old shape == new shape` for representative inputs.
- If an optimization changes canonical hash values, digest serialization shape, insertion/key ordering, persisted identity, or any comparable deterministic fingerprint, treat that as an explicit deliverable correction and Foundations F8 risk unless the ticket already authorizes a current-format identity migration. Stop for `1-3-1` before retaining the change, update the active ticket/spec with the authorized identity boundary, and prove deterministic parity at the nearest canonical seam. Do not present "hashes are only accelerators" as sufficient by itself when the ticket promised stable canonical hashes.
- Once final proof starts, treat any later active ticket/spec/report edit that changes status, outcome, touched files, command ledger, acceptance wording, or other closeout metadata as proof-affecting. Reconcile the edited artifact immediately and rerun the narrowest affected proof lane before citing final acceptance.
- Exception: append-only recording of a just-completed proof result does not invalidate that proof when it does not change status, acceptance criteria, command shape, scope, expected outcome, or any other contract. Rerun only the checks affected by the new or changed claim.
- Closeout exception: after all final lanes are green, classified, or explicitly substituted, setting the active ticket's terminal status and transcribing the exact proof results may preserve the just-run proof only when those edits do not change scope, acceptance criteria, command semantics, touched-file ownership, proof claims, or follow-up/dependency classification. Record the no-invalidation decision in the ticket outcome or final closeout when relying on this exception.
- After any later proof-affecting ticket/spec/report edit, reconcile earlier no-invalidation prose before final status. Search the edited outcome/ledger text for stale claims such as `no-invalidation`, `terminal closeout`, or `status/proof transcription only`, and remove or rewrite any line that now contradicts the later acceptance, scope, command, proof, or touched-file change.
- If an explicit ticket-named acceptance lane is red after the owned slice lands, do not mark the ticket with the repo-local terminal implementation status while classifying later. First classify each failing file/test. If any failure is on the changed execution path, a newly modified serialized/shared contract, or an architectural invariant the ticket touched, treat it as active-ticket-owned until proven otherwise and stop for `1-3-1` if the next fix or boundary change is not already user-authorized.
- Keep ownership classification separate from causal proof. `active-ticket-owned until proven otherwise` means the ticket cannot close yet; it does not mean the candidate caused the failure. For expensive timeouts, benchmark regressions, or other unstable acceptance failures, write `owned/unclassified` or equivalent until same-environment baseline, A/B worktree evidence, or another direct comparison proves causality.
- If a ticket status was already set to a terminal implementation status and a ticket-named lane later turns red, stalled, or unclassified, immediately reconcile the status before continuing: downgrade to a truthful nonterminal/partial state unless the lane classification or user-approved acceptance exception is already written into the ticket.
- When a broad package/workspace lane reports a file failure, timeout, or stale-looking runner label but the direct child command or narrower package lane gives contradictory evidence, load `references/verification-noisy-harness.md` before terminal status. Use it to distinguish child-test failures, wrapper defects, harness noise, stale reporter labels, and non-final broad-lane evidence before recording the closeout.
- If the user explicitly accepts a red benchmark or measured gate as "close enough," treat that as a user-approved acceptance exception, not as a passing gate. Update the active ticket/spec before final proof with exact red metrics, the original threshold, the authorization basis, residual risk, and whether follow-up is intentionally omitted. Use the repo's closest explicit exception status wording, such as `COMPLETED by user-approved acceptance exception` only when that matches the ticket family; never rewrite `pass=false` into `pass=true`.
- When a measured gate remains red after the correctness/code slice lands, and the referenced spec/ticket has explicit stop conditions, phase gates, or "when to abandon" clauses, scan those clauses before recommending more same-ticket optimization. If a stop condition may have fired, use `1-3-1` with options such as partial/blocked closeout plus follow-up, re-spec, or a clearly justified same-ticket optimization. Do not default to continuing local tuning merely because another plausible optimization exists.
- When a numeric or percentage metric appears inside an architecture, migration, proof, or cleanup ticket, do not assume it is a benchmark-style terminal gate. During reassessment, classify the metric as `terminal acceptance`, `diagnostic evidence`, or `successor input`. If live evidence shows a diagnostic canary remains red while the architecture slice can still truthfully land, update the active ticket/spec/sibling owner before final proof; if that changes an explicit acceptance deliverable, use `1-3-1` unless already authorized.
- For variable benchmark or measured-gate results, distinguish diagnostic samples from the one decisive final sample before writing durable status/spec prose. Prefer labels such as `*-baseline`, `*-candidate-probe`, and `*-final`, and make the ticket outcome name exactly one decisive final sample. Prefer transcribing one final same-command run after code, ticket, spec, successor, and dependency-graph edits are settled; label earlier runs as diagnostic unless they are the accepted baseline/comparison. If a final rerun drifts, update the active ticket/spec/dependency artifacts once with the exact final metric and run `pnpm run check:ticket-deps` when the ticket graph changed.
- Do not run decisive benchmark/profile samples concurrently with tests, builds, other profiles, dev servers, or artifact producers that can contend for CPU, I/O, cache, generated output, or process-local metrics. If a sample was captured under contention, label it diagnostic/contaminated and exclude it from terminal status or materiality decisions until a solo same-command sample confirms it.
- Before creating a successor for a still-red measured gate, hard-check the decisive final metric against the ticket's explicit bar, reviewer note, stop condition, and any user-confirmed "keep optimizing" instruction. If the final result does not meet the authorized closeout bar, continue optimizing the same ticket or stop for
1-3-1; do not create the successor as a decision substitute. If a diagnostic sample suggested a handoff but the final sample is materially different, re-evaluate the handoff before editing ticket graph artifacts.
- If the user challenges whether the successor is duplicating the active ticket, answer that challenge before editing or keeping the handoff. Provide a compact non-overlap proof:
active ticket retains/removes, residual still failing, successor owns, explicit exclusions, and why this is not the same work under a new id. Repair premature successor/spec/dependency artifacts if the proof does not hold.
- For profiling or benchmark tickets with exploratory optimization candidates, keep rejected candidates visible enough to prevent repetition: remove abandoned code/tests/counters, rerun or cite the cleanup proof that matters, and add a compact attempt ledger when the negative result is relevant to future owners. Sweep both generic and literal residue: helper APIs, imports, tests, counters, ticket claims, docs, branch labels, syntax variants, and string literals such as
case 'adjacent' or outcome names that belong only to the rejected path. Do this even when the accepted slice is small and the final gate remains red.
- Before creating or updating a successor/follow-up ticket, dependent-ticket rewrite, spec ticket-list update, or any other ticket-graph handoff, load
references/closeout-and-followup.md. Use that reference's successor and dependency-integrity checklists before editing those artifacts, not only after proof.
- Before authoring a new successor/follow-up ticket, perform the ticket-authoring preflight from
references/closeout-and-followup.md: inspect active overlap, read tickets/README.md and tickets/_TEMPLATE.md unless an already-current series-local format is the authoritative template, and record that rationale when you rely on the series-local format.
- For ticket/spec graph rewrites, use
apply_patch or a checked-in repo helper. Do not use ad hoc shell one-liners (sed, perl -e, node -e, etc.) for markdown ticket bodies when the edit touches prose, commands, backticks, metrics, status, acceptance text, or dependency ownership. A shell rewrite is acceptable only for a single literal path/id/status replacement that is quote-safe and has no command text or metric prose; inspect the affected hunk immediately afterward.
- For high-blast-radius graph closeout, prefer staged patches over one giant patch. A good sequence is: add successor, update spec/deps/status owners, update active ticket outcome, then apply terminal status when final lanes are settled. Inspect
git status --short or git diff --name-status between stages so one stale hunk cannot hide whether a successor file, dependency rewrite, or status edit actually landed.
- After a ticket/status patch that changes terminal status, successor ownership, dependency edges, ticket-list entries, or adds/deletes ticket files, immediately run
git status --short or git diff --name-status before continuing. Confirm the changed path set matches the intended graph edit, including untracked successor files, before running or citing final proof.
- If new untracked same-series sibling tickets appear mid-run, and the active ticket or spec references deferred sibling scope, open those sibling files far enough to confirm id, status, dependency role, and residual ownership before final closeout. If they are concurrent unrelated drafts, record that classification; if they change the active boundary or graph story, update the active ticket/spec first and rerun the affected final-proof gate. Use this compact status-delta ledger when only some new siblings are relevant:
new same-series untracked paths, opened because referenced, not opened because not referenced, active-boundary impact, and final classification.
- During ticket/spec graph sweeps, keep shell searches markdown-safe: prefer single-quoted
rg patterns, plain-string anchors without markdown backticks, or direct file reads. Never put literal markdown backticks inside double-quoted shell strings.
- Markdown-safe examples: use `rg -n '150FITLWASM-027.md' tickets specs` or split code-span checks into plain id/path anchors. Avoid double-quoted patterns such as ``"successor owner `tickets/FOO.md`"`` because the shell runs the backtick contents as a command substitution before rg runs.
- If you discover that a successor or ticket-graph edit was premature or wrong, repair it explicitly before continuing: restore/delete/update the affected ticket artifact,
rg both old and new ids across tickets specs docs, verify no stale references remain, run pnpm run check:ticket-deps when graph edges changed, rerun any proof lane invalidated by changed code or proof claims, and explain the corrected artifact state in the next user-visible update or final response.
- If the user asks for a
FOUNDATIONS.md-aligned reassessment mid-run, or confirms a recommended reassessment option, load references/boundary-reset-recovery.md, restate the new authoritative boundary, update the active ticket before final proof, and update sibling/spec artifacts when the corrected boundary changes their claims, dependency story, or ticket list. Then continue under the confirmed boundary.
- If the user asks to reassess proposed options before choosing, answer the option question directly before any file edit: map each option to
docs/FOUNDATIONS.md and AGENTS.md constraints, reject or demote options that conflict with those rules, restate the recommendation, and wait for explicit confirmation. After confirmation, record the approved option in the authorization ledger and continue under the reset checklist below.
- After confirmation, use this compact reset checklist before coding or resuming: re-inventory the whole mismatch class; sweep the same paragraph, acceptance criteria, and nearby spec prose for sibling stale nouns from the same imagined contract; update the active ticket's status/scope/acceptance language; create or update the follow-up owner if work is split; if a deliverable is deferred to existing later work, open the nearest named sibling ticket(s) long enough to confirm ownership or record why the active spec is sufficient; update specs and sibling ticket lists that changed; reconcile stale commands and artifact expectations; emit a refreshed working-notes checkpoint; run
pnpm run check:ticket-deps if the ticket graph changed unless the active ticket records why it does not apply; and rerun the final-proof gate before citing acceptance.
- For Foundation 14 identity removals, include the consumer-fallout classification in the refreshed checkpoint and corrected ticket text:
identity/type-only fallout absorbed here, behavioral sibling scope deferred, and the exact sibling/spec owner for deferred routing or semantic adoption. This prevents a no-alias compile fix from silently absorbing the sibling's behavior ticket, or a sibling deferral from preserving a compatibility alias.
- For exact numeric-domain corrections, do not round, truncate, or hide non-integer rule-authoritative values inside a new compiler/runtime path. If the user approves integer normalization, record the scale factor or transformation rule, prove it preserves the intended ordering/ratio semantics, sweep authored and generated mirrors, and add the proof surface to the ticket before final verification.
- If parity work proves the supposed TypeScript/reference path violates
FOUNDATIONS.md numeric or determinism rules, fix the reference path through the focused failing witness instead of weakening the new path. Record the semantic correction in the active ticket outcome and keep the ABI/runtime target aligned to FOUNDATIONS.md.
- If the user interrupts with a correction or asks why a handoff artifact exists, answer that correction directly before resuming. State the current artifact state, repair any premature graph/status edits under the newest confirmed boundary, and then continue the implementation only if the user has not asked you to pause or wait.
- If you become materially stuck after a partial repair, or you now have multiple plausible fixes with different tradeoffs and the next step is no longer clearly user-authorized, stop for the repo's
1-3-1 workflow instead of continuing to iterate. State the problem, give 3 options, recommend 1, and wait for confirmation before implementing another path.
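The markdown-safe search rule above (single-quoted patterns, no literal backticks inside double-quoted strings) can be illustrated with a short shell sketch. It is only an illustration: `grep -F` stands in for `rg` so the snippet runs without extra tools, the scratch file is hypothetical, and `tickets/FOO.md` is the placeholder path from the rule itself.

```shell
# Sketch: backticks stay literal in single quotes but run as a command
# substitution in double quotes. grep -F stands in for rg (an assumption
# made for portability); the shell resolves quoting before either tool runs.

notes=$(mktemp)   # hypothetical scratch file standing in for a ticket body
printf '%s\n' 'successor owner `tickets/FOO.md`' > "$notes"

# Safe: the single-quoted pattern reaches the search tool unchanged.
safe_count=$(grep -c -F '`tickets/FOO.md`' "$notes")
echo "single-quoted match count: $safe_count"

# Also safe: a plain path anchor with no backticks at all.
plain_count=$(grep -c -F 'tickets/FOO.md' "$notes")
echo "plain-anchor match count: $plain_count"

# Unsafe shape, shown with a harmless command: the shell runs whatever sits
# between the backticks and splices its output into the string.
unsafe="successor owner `echo SUBSTITUTED`"
echo "double-quoted result: $unsafe"

rm -f "$notes"
```

With a real ticket path between the backticks, the double-quoted form would try to execute that path as a command, so the search would run with a pattern the author never wrote.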
Final-Proof Gate
Before the first lane you intend to treat as the final acceptance-proof run, stop and verify all of the following. The ticket's scope, outcome draft, and proof plan should already be truthful before final proof starts, but avoid setting a terminal status such as this repo's IMPLEMENTED or another series-local equivalent until final proof and any ticket-named red/non-final lane classification are done, unless the ticket explicitly records a user-approved acceptance exception or narrowed proof substitution. For active draft tickets, write pending completion as prose in the outcome/proof plan if needed; reserve the **Status** field's terminal value for the point after all named lanes are green, classified, or explicitly substituted.
- the active ticket's intended durable state and outcome block already match the live intended result (
BLOCKED, narrowed scope, pending repo-local terminal status such as IMPLEMENTED, user-approved exception, etc.), without implying unproven broad lanes are green
- For terminal closeout, make this concrete before changing status: the active ticket must state
what landed, exact verification, schema/artifact fallout, deferred scope or sibling owner, and late-edit proof validity/no-invalidation. If a field truly does not apply, write none or not applicable; do not close with a bare status flip plus command list.
- any command substitutions or ticket-correction ledger entries are already written into the active ticket when needed
- any sibling-ticket, dependency, spec, or touched-file-scope edits required for a truthful closeout are already done
- every literal ticket command or shorthand command bundle has been accounted for before terminal status. Reopen the ticket's
Acceptance Criteria, Test Plan, Commands, outcome draft, and any command-like checklist lines; reconcile them against the exact wrapper commands you intend to cite (for example root pnpm test versus pnpm turbo test). For commands in multiple sections, shorthand bundles, or mixed direct/focused substitutions, write a compact pre-terminal command ledger before setting terminal status:
`ticket section | literal command/shorthand | ran directly/subsumed/split/not run | final citation`
- Example: `Test Plan | pnpm turbo lint typecheck test | split into pnpm turbo lint + pnpm turbo typecheck + pnpm turbo test | all three cited green`
- the exact final proof order is chosen and no later ticket-artifact rewrite is still expected
- stable-output proof sequencing is settled before any final lane starts: no final proof lane is running in parallel with a build, schema, or artifact producer that can clean or rewrite the same output tree; a zero-test or module-resolution "green" from an overlapped compiled-output lane is invalid until rerun serially
- If a final proof consumes
dist/ or another generated tree, schedule build-producing broad lanes before that final consumer proof when practical. If a later accepted lane rebuilds or cleans that tree, rerun the narrowest affected generated-output consumer proof before citing it as final.
- any previously failed ticket-named broad lane has already been classified in the active ticket as
owned failure, same-series residual / dependency blocker, or repo-preexisting unrelated blocker; do not use the repo-local terminal implementation status while a changed-path or architectural-invariant failure is still unclassified or still active-ticket-owned
- if the active ticket already has a terminal status and any later code/test/spec edit or ticket-named red/stalled/unclassified lane occurs, the status has been reconciled first: downgrade to pending/partial/blocked, or record the already-approved acceptance exception before continuing final proof
- substantial logic added to an existing source file has a cheap file-size check before final proof (
wc -l or equivalent), especially when the file may already be near or over repo guidance. In repos with an 800-line cap, treat >=600 current lines or expected growth of >=100 lines as enough to trigger the check. If the touched file is preexisting oversize plus active growth, extract the new helper when clearly in scope, stop for 1-3-1 if extraction would be nontrivial or widen the ticket, or record a justified deferral before closeout. If a touched file crosses the repo cap because of active growth, do not describe it as preexisting oversize; classify the cap crossing explicitly. When a touched source file starts near-cap, starts over-guidance, or ends over-guidance, include a compact closeout ledger: path | before lines | after lines | crossed cap? | active growth | extraction/defer rationale | successor if any.
- For profiling or benchmark tickets, treat this as a pre-edit gate once live evidence selects a large target file. Run
wc -l before adding new logic when practical; if the file is already over guidance and extraction is nontrivial, record the deferral before coding rather than discovering the oversize state only during final proof.
- the final touched-file sweep uses
git status --short or an equivalent untracked-aware check, not only git diff / git diff --stat, so newly added ticket deliverables cannot disappear from the closeout view
- When the ticket adds files, do not summarize the touched-file surface from
git diff --stat alone; pair it with git status --short or explicitly list the untracked paths so new tests, fixtures, reports, or generated artifacts are visible in the closeout.
- Remember that
git diff --check does not inspect untracked files unless they are staged or checked through an equivalent path. If final hygiene depends on whitespace/patch checks, include newly added source, test, fixture, ticket, and report files through staging, a targeted equivalent check, or an explicit rationale that another final lane already covers the relevant hygiene.
- For a targeted untracked-file whitespace check such as
git diff --no-index --check /dev/null <path>, the command exits nonzero for ordinary content differences; treat empty stdout/stderr as whitespace-clean and reserve failure for reported whitespace diagnostics or command errors.
- For referenced active tickets, specs, or siblings that were already dirty or untracked before implementation, classify their final provenance before closeout:
touched by this implementation, pre-existing and still unrelated, read-only context, or concurrent/sibling draft. Do not claim the final touched-file surface from status alone when pre-existing dirty referenced artifacts are mixed with implementation edits.
- if any ticket dependency, successor ticket, sibling ticket, archived-reference path, terminal status, active/archive classification, or other ticket-graph/state edge changed,
pnpm run check:ticket-deps has already run and been recorded, or the active ticket explicitly records why that check does not exist or does not apply
- for a red measured-gate exception closeout, the decisive final same-command metric has already been rerun and copied everywhere, the active ticket no longer implies a green gate or unproven wall-clock win, the successor/residual owner is explicit, and the terminal status is still pending until those facts plus dependency checks are settled
- if final proof requires a new or updated successor/follow-up ticket, dependent-ticket rewrite, spec ticket-list update, or similar ownership handoff,
references/closeout-and-followup.md has already been loaded, the successor-ticket authoring preflight has been satisfied, those artifacts already exist, and pnpm run check:ticket-deps has run before the first proof lane you intend to cite as final, unless the active ticket explicitly records why that check does not apply
- for benchmark/measured-gate tickets, the active ticket or closeout includes a compact materiality ledger: `baseline`, `decisive final`, `target`, `delta`, `percent change`, `verdict`, and `terminal status allowed?`. Vague user wording such as "next owner" is not a replacement for this ledger or for required 1-3-1 confirmation
- the final response plan includes the dirty-state delta from
git status --short, including newly created untracked files, plus the ticket's archive status (post-ticket-review already ran, archived, or implemented but not archived)
- the final response plan includes proof validity after post-proof edits: either affected proof lanes reran, or the active ticket/final closeout records the exact no-invalidation rationale
- for shared performance or cleanup changes, the closeout names the runtime surface breadth:
ticket-specific, policy/agent-only, script/profile-only, or shared engine/kernel, with any relevant non-agent paths called out
If any answer is no, update the ticket and related artifacts first, then start the final acceptance-proof set.
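The cheap file-size check from the gate above can be sketched as a few shell helpers. This is a minimal sketch, assuming the 800-line cap and the 600-line / 100-line triggers quoted in that item; the function names and the `check`/`skip`/`yes`/`no` output words are illustrative, and treating exactly 800 lines as still within the cap is an assumption.

```shell
# Sketch of the file-size trigger for a repo with an 800-line cap.
# Function names and output words are hypothetical, not repo helpers.

CAP=800             # repo cap quoted in the guidance
NEAR_CAP=600        # current-line trigger
GROWTH_TRIGGER=100  # expected-growth trigger

# size_check_needed CURRENT_LINES EXPECTED_GROWTH -> prints check or skip
size_check_needed() {
  if [ "$1" -ge "$NEAR_CAP" ] || [ "$2" -ge "$GROWTH_TRIGGER" ]; then
    echo check
  else
    echo skip
  fi
}

# crossed_cap BEFORE_LINES AFTER_LINES -> prints yes only when the file
# started at or under the cap and ended over it (active growth crossed it).
crossed_cap() {
  if [ "$1" -le "$CAP" ] && [ "$2" -gt "$CAP" ]; then
    echo yes
  else
    echo no
  fi
}

# line_count PATH -> line count as reported by wc -l, padding stripped
line_count() {
  wc -l < "$1" | tr -d ' '
}

size_check_needed 640 20   # prints: check
size_check_needed 300 40   # prints: skip
crossed_cap 780 830        # prints: yes
```

A file that was already over the cap before the edit reports `no` from `crossed_cap`, which matches the instruction to classify preexisting oversize separately from a cap crossing caused by active growth.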
For ordinary non-measured tickets whose final lanes go green, use this compact green closeout order:
- Before final proof, make the active ticket truthful about scope, command substitutions, generated fallout, touched-file ownership, and the proof lanes you intend to cite. Account for every literal command and shorthand bundle from the ticket's acceptance/test/command sections, but leave the repo-local terminal status pending.
- Run the final proof lanes serially with stable generated outputs; if a later accepted lane rebuilds or cleans
dist/, schemas, fixtures, or another consumed output tree, rerun the narrowest affected consumer proof.
- After all final lanes are green, classified, or explicitly substituted, apply the terminal status plus exact proof-result transcription as the final narrow ticket edit when practical.
- If that edit only records the just-run proof and does not change scope, acceptance criteria, command semantics, touched-file ownership, proof claims, follow-up ownership, or dependency classification, record the no-invalidation rationale in the ticket outcome or final closeout instead of rerunning broad lanes.
- If any later proof-affecting edit happened after an earlier no-invalidation note, sweep the ticket/spec/report outcome for stale no-invalidation or terminal-closeout claims and reconcile them before dependency checks.
- Run
pnpm run check:ticket-deps when terminal status, dependencies, successor ownership, or other ticket-graph facts changed or when the active ticket/family expects dependency integrity proof.
- Finish with an untracked-aware
git status --short sweep and hand off to $post-ticket-review unless the user explicitly included archival in the implementation request.
- In the final response, explicitly say whether
$post-ticket-review already ran. If it did not, say the ticket is implemented but not archived and name $post-ticket-review as the next review/archive workflow.
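The untracked-aware sweep mentioned above can be sketched as a small filter over `git status --short` output. The helper name `untracked_paths` and the sample paths are hypothetical; the only behavior relied on is that the short format marks untracked entries with a leading `??`.

```shell
# untracked_paths: read `git status --short` output on stdin and print only
# the untracked entries (lines that start with "?? ").
# The helper name is illustrative, not a repo script.
untracked_paths() {
  awk 'substr($0, 1, 3) == "?? " { print substr($0, 4) }'
}

# Demonstration on canned status text (hypothetical paths), so the sketch
# runs outside a repository:
printf '%s\n' '?? tickets/NEW-SUCCESSOR.md' ' M src/engine.ts' 'A  test/engine.test.ts' \
  | untracked_paths
# prints: tickets/NEW-SUCCESSOR.md

# Typical use inside a repository (not run here):
#   git status --short | untracked_paths
```

A new successor ticket that was never staged would be missing from `git diff --stat` but shows up in this listing. Note that git collapses an untracked directory to a single entry unless `--untracked-files=all` is passed.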
When the decisive proof lane itself determines the final classification, do not pretend the outcome can be fully written beforehand. Instead:
- Prewrite the intended outcome/handoff as
pending or equivalent before the decisive lane, including the exact command and expected decision branches.
- Run the decisive lane only after code, sibling, spec, and command-shape edits that are knowable beforehand are complete.
- Immediately transcribe the exact metric/classification/status into the active ticket and any directly affected sibling/spec artifacts.
- Run
pnpm run check:ticket-deps when ticket graph edges, successor ownership, or statuses changed, unless the active ticket explicitly records why that check does not exist or does not apply.
- Classify whether the post-lane edits were transcription-only or proof-affecting; rerun only the proof lanes invalidated by changed code, command semantics, thresholds, scope, or acceptance boundaries.
Workflow
Ticket-Type Triage
Load references/ticket-type-triage.md to classify the ticket into the smallest live category before loading further references, and to run the category-specific preflights (bounded local refactor, proof/benchmark/audit/investigation, event/card/action-identity repro, gate/smoke/regression historical witness, historical-evidence sufficiency, contradictory live evidence, shared-contract downstream consumers).
For investigation / measurement / fixture-producing tickets, do not default to the bounded-local-refactor fast path merely because the owned files are "just a script plus artifacts." If the ticket predicts an empirical outcome, witness distribution, or measured subset shape, prefer the investigation/proof path unless reassessment proves the evidence surface is trivial and contradiction risk is low.
For bounded campaign diagnostic migrations that import compiled engine artifacts and write ephemeral outputs, use a compact proof checklist before broad lanes: build first; capture the smallest truthful pre-change baseline when parity is required; direct outputs to /tmp or another throwaway path; preserve process-local metrics and lightweight options explicitly; run node --check; run the grep/removal invariant the ticket names; and include a tiny cadence/output smoke after any build-producing final lane when snapshots, reports, or generated filenames are part of the contract.
When the owned metric is process-local (heapUsed, GC time/percent, RSS, resident subprocess state, or similar), verify early whether the named package script / lane wrapper preserves that metric's meaning. If the runner batches multiple files in one process, adds helper subprocesses, or otherwise changes the measured process boundary, treat that wrapper as part of the owned proof surface rather than iterating on thresholds first.
When a proof ticket requires a new calibrated threshold or ceiling, prefer this order unless live evidence proves a different shape is cleaner: first land the narrowest truthful witness/harness, then run the smallest live calibration probe that exercises the owned seam, write the measured values and resulting threshold rationale back into the owned artifact, and only then start the final acceptance-proof lanes. Do not cite a draft or exploratory threshold as final if the ticket artifact still needs to change after calibration.
When a profiling ticket needs a new counter or diagnostic field to expose the owned metric, add the counter before changing the measured behavior when practical, then capture a same-seam baseline and current result with the same output shape. If the baseline predates the counter, record the old proxy metric explicitly, explain why it is comparable, and keep the final verdict on the ticket-owned metric rather than raw wall-clock noise alone.
For profiling or benchmark red-gate tickets, prefer the profiling fast path from Implementation Rules unless live reassessment triggers heavier guidance: load references/working-notes.md, references/ticket-type-triage.md, references/specialized-ticket-types.md, and references/verification.md; defer broader references until split ownership, nontrivial discrepancy, shared-contract/schema fallout, noisy harness behavior, command-wrapper ambiguity, or post-proof invalidation requires them.
When a code migration lands but an explicit benchmark/performance gate remains red, do not mark the ticket with the repo-local terminal implementation status merely because the implementation and ordinary tests are green. Record the measured samples, threshold comparison, and variance. If satisfying or relaxing the gate would change an explicit ticket deliverable, stop for 1-3-1 before creating follow-up tickets, rewriting spec/ticket ownership, or narrowing the acceptance claim. After user confirmation, create or update the follow-up owner required by the ticket/spec, update the active ticket/spec to a truthful blocked or partial state, and run pnpm run check:ticket-deps after the ticket graph changes.
Exception: if the active ticket itself explicitly defines "red measured result + active route proof + successor/follow-up owner" as the acceptance-complete outcome, that ticket-specific contract can close with the repo-local terminal wording only after the ticket records the exact red metrics, active route or implementation proof, successor owner, dependency/status rewrites, and pnpm run check:ticket-deps result or non-applicability. Prefer explicit status wording such as COMPLETED with red measured gate successor <ticket> only when that matches the ticket family, or the repo's closest equivalent, so later review/archive passes can distinguish this from an ordinary green-gate completion or a user-approved acceptance exception. Do not infer this exception from ordinary red-gate language. If the ticket merely asks to "close the gate", "make it green", or otherwise lacks an explicit red-plus-successor completion contract, use BLOCKED, PARTIAL, or the repo-equivalent landed-but-red state instead.
- Use this terminal-status decision table when a measured gate is still red after the owned slice lands:
- `green retained gate`: `COMPLETED` after exact metric transcription and normal final proof.
- `explicit red-plus-successor completion contract`: terminal repo-local status may be used only with exact red metrics, materiality verdict, active route/implementation proof, non-overlapping successor, dependency integrity, and no-invalidation/rerun ledger.
- `red/blocked phase decision without explicit terminal wording`: keep `BLOCKED`, `PARTIAL`, or the repo's landed-but-red equivalent, or stop for 1-3-1 before converting it to a terminal closeout.
- `user-approved close-enough red gate`: record it as an acceptance exception, not as a passing gate; terminal status depends on the ticket family's explicit exception wording.
- Precedence rule for materiality language: if the ticket, spec, or reviewer note requires a qualitative result such as "significant", "meaningful", or "not tiny", classify the final same-command delta as
material, minor, or not demonstrated before using this exception. A red-plus-successor closeout is not authorized on a minor or not demonstrated result unless the user confirms that revised closeout through 1-3-1; otherwise keep the active ticket PARTIAL, BLOCKED, or another truthful non-green state.
- Final-proof ordering exception for this ticket-specific contract: when the successor's exact scope depends on the decisive red measurement, the successor/follow-up cannot always exist before that measurement. In that case, run the decisive measurement only after the active route proof and instrumentation are in place, then immediately write the red metrics, create/update the successor, rewrite dependent tickets/specs, run
pnpm run check:ticket-deps unless the active ticket records why it does not apply, and rerun the narrowest proof lanes affected by those edits. If the post-measurement edits only transcribe metrics and ownership without changing code, command semantics, thresholds, or acceptance boundaries, record why the measurement itself was not invalidated.
- For CPU-profile-backed successors, require the successor itself to include the profiling evidence block, not only the closing active ticket:
profile command, profile artifact path or explicit ephemeral-artifact note, parser command or parser method, baseline/current metric, top owners or residual stack samples, and why the successor is non-overlapping. Do this before terminal status so dependency rewrites and final proof cite the same ownership story.
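The materiality classification above, together with the materiality ledger fields named in the Final-Proof Gate, can be sketched as one ledger row computed from a baseline and a decisive final sample. This is a sketch under stated assumptions: the metric is lower-is-better, the baseline is nonzero, and the `MINOR_PCT` cutoff is a hypothetical parameter. The real bar comes from the ticket, spec, or reviewer note, and the helper name `materiality_row` is illustrative.

```shell
# materiality_row BASELINE FINAL TARGET MINOR_PCT
# Prints: baseline | final | target | delta | percent change | verdict | gate
# Assumptions: lower is better, BASELINE is nonzero, and MINOR_PCT is a
# hypothetical cutoff below which an improvement is classified as minor.
materiality_row() {
  awk -v b="$1" -v f="$2" -v t="$3" -v m="$4" 'BEGIN {
    delta = f - b
    pct = delta / b * 100
    if (delta >= 0)     verdict = "not demonstrated"
    else if (-pct < m)  verdict = "minor"
    else                verdict = "material"
    gate = (f <= t) ? "green" : "red"
    printf "%s | %s | %s | %.2f | %.2f%% | %s | gate %s\n", b, f, t, delta, pct, verdict, gate
  }'
}

materiality_row 1200 1140 900 10
# prints: 1200 | 1140 | 900 | -60.00 | -5.00% | minor | gate red
```

The row deliberately omits the `terminal status allowed?` field: that answer depends on the ticket's explicit completion contract and any 1-3-1 confirmation, which arithmetic alone cannot decide. A single-sample comparison also says nothing about variance, so the verdict is only as good as the decisive same-command sample behind it.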
For proof, inventory, audit, or fixture-producing tickets where the successor scope depends on the decisive classification evidence rather than on a red benchmark threshold, use the same ordering discipline without pretending it is a measured-gate exception. Land the smallest truthful witness/inventory surface first, run the decisive classification command only after required instrumentation exists, immediately record the classification, create/update the successor and affected spec/sibling graph, run pnpm run check:ticket-deps unless the active ticket records why it does not apply, and rerun or explicitly classify every proof lane affected by the post-classification edits.
Red measured-gate closeout checklist after correctness lands:
- Keep the ticket status nonterminal until the decisive same-command measurement is complete, the exact metric is transcribed, successor/dependency ownership is settled, and dependency checks have run or are explicitly classified non-applicable. When practical, make terminal status the final status-only patch after those facts are settled.
- Record exact command, metric(s), threshold, and verdict in the active ticket/report. Remove or soften any causal or improvement claim that the decisive sample no longer proves.
- If the final result is materially worse than historical evidence or an earlier same-seam sample, run the cheapest same-checkout A/B comparison that preserves the owned seam before writing causal language. Use this shape when practical: capture the current baseline with the exact command and activation counters; apply or enable one candidate; rerun the exact command; revert rejected residue; run the smallest cleanup proof; classify the delta. Examples include route enabled vs disabled, old helper vs new helper, current branch vs temp baseline, or another direct comparison. If that comparison is disproportionate or deferred, classify the residual as
current active-route red evidence or owned/unclassified residual, not as proven causality.
- Compare the result to any spec stop conditions, phase gates, or re-spec triggers before proposing more optimization.
- If satisfying or relaxing the gate changes status, scope, dependency story, or phase plan, stop for
1-3-1 unless already authorized.
- If the ticket/spec/reviewer language requires a material reduction and the final classified delta is
minor or not demonstrated, do not apply the red-plus-successor terminal wording without user confirmation through 1-3-1.
- After confirmation, create/update the follow-up owner and update dependent tickets/spec ticket lists before terminal status. Set a truthful durable state (
PARTIAL, BLOCKED, etc.) or the ticket-family's explicit red-plus-successor terminal wording only after the graph and proof story are settled.
- Run `pnpm run check:ticket-deps` when the ticket graph changes, unless the active ticket explicitly records why that check does not exist or does not apply.
- Rerun the narrowest affected proof or record why the edit is metric transcription only, then continue final proof choreography. If the terminal status edit is separate and status-only, record the no-invalidation rationale in the ticket outcome or final closeout.
Mandatory terminal-status order for red measured gates:
- decisive metric captured and materiality verdict recorded
- successor/spec/sibling/dependency edits applied
- `pnpm run check:ticket-deps` run or explicitly classified non-applicable
- no-invalidation rationale or affected proof rerun recorded
- terminal status applied as a final status-only patch when possible
Compact choreography for a red-gate slice that lands a retained improvement:
- Capture or cite a same-seam baseline with the route/activation counters needed to prove the owned path is exercised.
- Land one candidate at a time; keep it only if focused correctness passes and it improves the ticket-owned metric or root-cause counter.
- Run the decisive same-seam measurement after code and knowable ticket/spec edits are settled.
- If the gate remains red but the ticket allows red-result completion, immediately record the exact red metric, threshold, active-route proof, retained-candidate classification, and rejected-candidate ledger.
- Create or update the non-overlapping successor, rewrite dependent tickets/specs, and run pnpm run check:ticket-deps.
- If the decisive metric drifts from an earlier probe, update every durable metric reference once, remove or narrow stale "improved" wording, confirm the successor still owns the residual, and rerun pnpm run check:ticket-deps if graph/status/dependency text changed.
- Rerun only proof lanes invalidated by the graph edits; if edits merely transcribe metrics and ownership, record a no-invalidation note before final closeout.
- Apply terminal status as the final narrow edit when possible; if it lands together with transcription or dependency edits, explicitly classify why the just-run metric and affected proof lanes remain valid or which lanes were rerun.
Optional red-gate outcome worksheet:
diagnostic baseline: command, label, metric, threshold, active-route counters
candidate probes: retained and rejected candidates, diagnostic correctness proof while each candidate existed, measured result, post-revert cleanup proof for rejected paths
decisive final metric: command, label, metric, threshold, verdict, drift from probes
CPU/profile evidence: artifact path or ephemeral note, parser command/method, top owners, ticket-owned samples, residual samples
successor handoff: successor id, non-overlap rationale, dependent ticket/spec rewrites, pnpm run check:ticket-deps result
proof invalidation: post-metric edits, rerun lanes, no-invalidation rationale, terminal status timing
If a profiling, benchmark, or measured-gate ticket proves not to be a runtime optimization problem, do not invent a hot-path fix just to satisfy the draft shape. Record the lane command, metric or budget, verdict, slowest relevant files/tests when applicable, and root-cause classification (stale fixture, workflow gating, harness noise, repo-preexisting, etc.). State whether CPU profiling was unnecessary because no red runtime gate remains, record the non-runtime repair that was accepted, and update ticket/spec/sibling wording so future work does not keep chasing a disproven performance cause.
- For measured-gate harness drift, compare the checked-in test or script setup against the authoritative report, script, archived evidence, or baseline witness before changing code. Classify each setup difference that affects the metric: fixture source, warmup/precompile, runtime initialization, counters, diagnostics/trace level, cached/generated artifacts, or command wrapper semantics. If the change restores the original witness while preserving the same command, threshold, and owned metric, record it as a harness repair. If it changes the measured contract, witness artifact, or acceptance noun, stop for 1-3-1 unless the user already authorized that correction.
- For multi-test perf suites such as test:perf, do a cheap ownership preflight before treating the broad suite as final proof: identify the owned perf file or subtest, list likely unrelated perf files or known active residual owners when practical, and decide whether final proof requires the whole suite or the owned perf file plus a classified broad-suite result. Record that plan in working notes and the active ticket before launching expensive suite runs.
For profiling or benchmark tickets with multiple plausible optimizations, use a measured experiment loop instead of stacking speculative changes: apply one candidate at a time; run focused correctness proof; run the smallest representative smoke measurement; profile only if the smoke is promising or diagnostically necessary; revert or isolate candidates that regress, do not move an objective ticket-owned root-cause metric, remain red without a user-approved exception, or shift the hotspot outside the owned seam. Keep a candidate only when it produces a real measured reduction on the owned root-cause metric and the active ticket/report stays truthful about any still-red acceptance gate; do not treat a CPU-profile-only win as closeout proof when the ticket-owned wall-clock or parity lane remains red. Keep durable artifacts focused on accepted candidates, plus a compact attempt ledger when negative evidence would prevent repeated work.
- Before closeout, classify each retained performance candidate as exactly one of: "owned metric improved", "root-cause counter improved", "same-checkout A/B proves neutral", "user-approved keep", or "revert before closeout". If a final gate is still red or worse and no retained-candidate classification supports keeping the change, do not close the ticket; revert the candidate or stop for 1-3-1.
- If a small cleanup follows an accepted optimization and is not the best measured sample on its own, it may still be retained as part of the accepted slice when it preserves the implementation shape, removes duplication or allocation from the same owned path, remains route-clean/correctness-green, and the combined final metric still supports the red-gate handoff. Record that classification explicitly instead of forcing it into a standalone "owned metric improved" claim.
- For rejected optimization candidates, record a compact attempt ledger when the negative result is likely to prevent repeated work: candidate, correctness proof, measurement, verdict, cleanup proof, and reason not retained.
- When abandoning or replacing a candidate design, do a cleanup sweep before final proof: remove stale helper APIs, imports, tests, counters, ticket claims, and docs from the abandoned path unless the user explicitly wants that exploratory diff preserved. Use rg for candidate-specific helper names so dead public or semi-public surfaces do not survive as accidental parallel APIs.
- For a tried-and-reverted measured candidate, keep the closeout ledger explicit even when no runtime diff remains: candidate, why tried, measurement, why rejected, revert/cleanup confirmation, retained artifact or report, and final proof lanes. Confirm git diff -- <candidate-owned paths> is empty or contains only intentional non-candidate artifacts before terminal status.
- Before launching a profiling or measurement command that may run silently for minutes, choose and record a bounded stop/progress plan in working notes: expected first-output or completion window, timeout or manual-stop threshold, whether instrumentation perturbs the metric, and the fallback smaller probe if the command does not return. Do not discover the bound only after an open-ended run has already consumed the feedback loop.
- A mathematical lower-bound proof can be a decisive red-gate measurement only when it preserves the ticket-owned seam and makes the target impossible even under optimistic assumptions. Before using one, record the exact formula, sample source, why the sampled unit is a valid lower bound for the whole corpus, whether any unsampled class could be faster enough to change the verdict, and why a full corpus rerun would not change the red/green decision. If this replaces an explicit full-corpus, exact-seed, or named-command deliverable, treat it as a witness-deliverable correction and use
1-3-1 unless already authorized. If accepted, update the active ticket/spec before final proof and label the full run as historical, superseded, or still required.
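The retained-candidate classification rule in this list lends itself to a small closed type. The sketch below is illustrative only: the identifiers are hypothetical, and it assumes a candidate that has not yet been classified is represented as null.

```typescript
// The five classifications named in the list above; identifiers are illustrative.
type CandidateClass =
  | "owned-metric-improved"
  | "root-cause-counter-improved"
  | "same-checkout-ab-neutral"
  | "user-approved-keep"
  | "revert-before-closeout";

interface RetainedCandidate {
  name: string;
  // null models a candidate that has not been classified yet (an assumption).
  classification: CandidateClass | null;
}

// Lists the candidates that still block closeout: anything unclassified, and
// anything classified as requiring a revert that is still in the diff.
function closeoutBlockers(candidates: readonly RetainedCandidate[]): string[] {
  const blockers: string[] = [];
  for (const candidate of candidates) {
    if (candidate.classification === null) {
      blockers.push(`${candidate.name}: unclassified`);
    } else if (candidate.classification === "revert-before-closeout") {
      blockers.push(`${candidate.name}: revert before closeout`);
    }
  }
  return blockers;
}
```

An empty result means every retained candidate carries exactly one keep-supporting classification; it says nothing about whether the final gate is green, which still has to be judged separately.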
For measured decision tickets whose truthful result is respec-only completion rather than a retained optimization or successor, make that closure explicit before terminal status. Use this compact worksheet: retired proof surface, replacement evidence, why no code/topology change is retained, why no successor is needed, retained code/report diff, materiality verdict, dependency/spec edits, and terminal-status basis. The ticket/spec must stop presenting the retired gate as active, and final proof should validate the retained artifact plus dependency integrity rather than rerunning a disproven expensive gate by inertia.
For inventory, audit, or fixture-producing tickets, verify the live ownership unit before building the deliverable. A ticket may name a broad semantic surface (march, event-card action, policy profile, etc.) while the real repo-owned boundary is finer-grained (actionPipeline id, card side, phase variant, emitted report row, or another runtime-owned artifact). Build the deliverable against the finest truthful live unit rather than collapsing distinct surfaces into the draft ticket's coarser prose.
For static inventory, registry, enum, feature-table, or list-completeness tickets, mechanically reconcile prose counts against the authoritative list before closeout. Count source constants, inventory rows, checklist entries, and spec/ticket prose labels such as "18 kinds" or "all N rows"; update stale counts before final proof so the durable outcome cannot disagree with the actual list.
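One way to reconcile prose counts mechanically is to scan the ticket or spec text for "N <noun>" phrases and compare each N with the length of the authoritative list. This is a sketch under stated assumptions: findStaleCounts is a hypothetical helper, and the noun is assumed to be a plain word with no regex metacharacters.

```typescript
interface CountMismatch {
  phrase: string;
  claimed: number;
  actual: number;
}

// Finds phrases such as "18 kinds" in prose and reports those whose number
// disagrees with the authoritative list length.
// Assumes `noun` is a plain word (it is interpolated into a RegExp unescaped).
function findStaleCounts(
  prose: string,
  noun: string,
  authoritative: readonly string[],
): CountMismatch[] {
  const pattern = new RegExp(`(\\d+)\\s+${noun}\\b`, "g");
  const mismatches: CountMismatch[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prose)) !== null) {
    const claimed = Number(match[1]);
    if (claimed !== authoritative.length) {
      mismatches.push({ phrase: match[0], claimed, actual: authoritative.length });
    }
  }
  return mismatches;
}
```

Phrases like "all N rows" that use a literal N rather than a digit are not caught by this pattern and still need a manual read.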
For supported/fail-closed inventory tickets, make the classification evidence authoritative before closeout. Prefer computing inventory rows from the live runtime, ABI, compiled artifact, or validator seam. If a row is static transcription, pair it with focused proof that exercises the named seam and label it as transcription rather than independent runtime evidence. Before final proof, keep a compact before/after classification ledger: baseline rows, final rows, changed classes, unchanged residuals, and successor owner.
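The before/after classification ledger can be derived from two inventory snapshots rather than written by hand. A minimal sketch, assuming each snapshot maps a row id to its class; the type names and buildLedger are hypothetical, and the successor owner is left out because it is a human decision rather than a computed value.

```typescript
type RowClass = "supported" | "fail-closed";
type Inventory = Record<string, RowClass>;

interface ClassificationLedger {
  baselineRows: number;
  finalRows: number;
  changedClasses: string[];      // rows whose class changed, appeared, or disappeared
  unchangedResiduals: string[];  // rows that were and remain fail-closed
}

function buildLedger(baseline: Inventory, current: Inventory): ClassificationLedger {
  const changedClasses: string[] = [];
  const unchangedResiduals: string[] = [];
  for (const id of Object.keys(current)) {
    const inBaseline = Object.prototype.hasOwnProperty.call(baseline, id);
    const before = inBaseline ? baseline[id] : "absent";
    const after = current[id];
    if (before !== after) {
      changedClasses.push(`${id}: ${before} -> ${after}`);
    } else if (after === "fail-closed") {
      unchangedResiduals.push(id);
    }
  }
  for (const id of Object.keys(baseline)) {
    if (!Object.prototype.hasOwnProperty.call(current, id)) {
      changedClasses.push(`${id}: ${baseline[id]} -> absent`);
    }
  }
  return {
    baselineRows: Object.keys(baseline).length,
    finalRows: Object.keys(current).length,
    changedClasses,
    unchangedResiduals,
  };
}
```

The ledger is only as authoritative as its inputs: rows transcribed by hand should still be labeled as transcription in the ticket, as described above.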
When an inventory ticket both lands a generic substrate and classifies the current corpus as supported versus fail-closed, keep a compact outcome ledger so the closeout cannot be mistaken for production routing. Include: supported subset, fail-closed classes, inventory command, activation/unsupported counters, parity witness, immutability or safety witness when relevant, successor owner, and final proof lanes.
For conformance, representative-corpus, or architectural-witness tickets, verify whether each named representative family actually exercises the ticket-owned runtime seam before treating the draft fixture shape as mandatory. If one production family exercises the active path and another only reaches a non-applicable or exit path, prefer a truthful split: active-path witness where live production states exercise it, non-applicable/exit witness where they do not, and a ticket/spec rewrite after 1-3-1 when the original deliverable explicitly required active-path coverage in the non-exercising family. If one representative coverage gap is found, run a compact inventory for the whole claimed coverage surface before proposing or applying the boundary reset, so the reset addresses the full mismatch class. Do not mutate production GameSpecDoc data or add game-specific engine behavior solely to manufacture a representative-family path; use a generic synthetic fixture only when the corrected boundary still requires an engine-generic second active-path witness.
- When the draft names an exact seed/turn/corpus size but live runtime makes that exact simulator corpus disproportionate, do not silently narrow to a cheaper proof. Preserve the architectural invariant first: stop for
1-3-1, propose a split such as fast sentinels plus a slow-lane corpus or direct public-seam corpus, and update the active ticket before final proof if the user confirms. The final shape should keep F16 automated proof while respecting F10 bounded feedback.
- When a staged proof harness is checked in before a sibling module or runtime path exists, keep the current ticket's proof active and gate only the future-owned branch. Avoid static imports of not-yet-landed modules; use the live sibling-owned activation flag, dynamic loading or existence checks when needed, and an explicit skip/deferred message for the future branch. Record the sibling owner and activation command in the active ticket before final proof.
- Replacing a ticket-requested live simulator witness, exact seed, exact fixture, or checked-in corpus artifact with a synthetic trace, direct public-seam construction, or smaller surrogate is an explicit witness-deliverable correction. Stop for
1-3-1 unless the user has already authorized that class of substitution; if confirmed, update the ticket/spec before final proof with why the live witness was unavailable or disproportionate and what invariant the replacement proves.
For small test-only regression tickets whose owned deliverable is one new or modified test plus minor docs/ticket closeout, use the bounded local refactor fast path when live reassessment shows no blocking drift, no schema/fixture regeneration, and no wider shared-contract fallout. Still apply production-proof checks for stale examples, correct suite-family placement, compiled-test command shape, and acceptance-command reconciliation.
- For generator/protocol tests where the owned invariant is the generic loop, iterator, publication, or termination shape rather than multi-turn gameplay progression, keep representative production fixtures bounded to the smallest truthful seam. A production
GameDef with maxTurns: 0 or an equivalent immediate-exit configuration can be a valid representative when it proves the production artifact boundary, while a separate synthetic fixture exercises active player-step progression. Record this as a proof-shape correction if the ticket draft expected a longer representative run.
When a ticket otherwise looks bounded but changes a serialized trace/result shape, generated schema, exported union, or required diagnostic field, classify it as mixed bounded local refactor + shared-contract rather than pure bounded-local. Do an early cross-package rg for the changed field/type/literal, list runner/UI/report/fixture consumers, and plan a workspace-level build or typecheck lane before closeout.
- For required trace/result field additions, run a mirror map before broad proof: sibling field model, every runtime producer or candidate-metadata builder, schema source such as Zod/type mirrors, generated schema artifact, hand-authored trace/JSON fixtures, and exact-shape JSON schema or trace-shape tests. Treat generated-schema drift as owned until a schema-artifact check proves otherwise, and record the touched-file/proof fallout in the active ticket before terminal closeout.
For closed trace/diagnostic enum extensions, treat source constants and trace-schema mirrors as one contract before coding. Use a compact mirror sweep: exported as const or source enum, kernel trace union/type mirror, Zod/schema source such as schemas-core.ts, generated JSON schema artifact, and the nearest schema-shape or trace-shape test. If the ticket draft claims "no consumer change required" but the mirror sweep finds one of these surfaces, record it as owned shared-contract fallout in working notes and the active ticket before final proof.
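The mirror sweep amounts to comparing one source list of literals against every surface that must repeat it. The sketch below assumes each mirror has already been read into a plain list of strings; the surface names, literals, and mirrorDrift helper are hypothetical and do not describe the repo's actual schema files.

```typescript
// A mirror is any surface that must list the same literals as the source
// constant: type mirror, schema source, generated JSON schema enum, shape test.
interface Mirror {
  surface: string;
  literals: readonly string[];
}

// Reports every literal that a mirror is missing or carries in excess of the
// source constant. An empty result means the contract is in sync.
function mirrorDrift(source: readonly string[], mirrors: readonly Mirror[]): string[] {
  const drift: string[] = [];
  for (const mirror of mirrors) {
    for (const literal of source) {
      if (mirror.literals.indexOf(literal) === -1) {
        drift.push(`${mirror.surface}: missing ${literal}`);
      }
    }
    for (const literal of mirror.literals) {
      if (source.indexOf(literal) === -1) {
        drift.push(`${mirror.surface}: extra ${literal}`);
      }
    }
  }
  return drift;
}

// Hypothetical closed enum after an extension that added "deferred".
const EXAMPLE_KINDS = ["published", "rejected", "deferred"] as const;
```

Any non-empty drift entry is a candidate for the "owned shared-contract fallout" note, even when the draft claimed no consumer change was required.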
For pure compiler or validator diagnostic-code renames, keep the Foundation 14 no-alias proof explicit without over-expanding into unrelated schema work:
- rename the canonical code key and string value together
- update the emission site, focused assertions, and any source/test fixtures that consume the literal
- run a source/test sweep for the retired literal and require zero hits in owned mutable code and tests
- classify remaining hits in active specs/tickets, archived tickets, reports, or docs as either "historical/explanatory allowed" or "active-contract drift to update"
- record in the active ticket whether generated schema/artifact fallout is owned; if the diagnostic code is not part of a generated public schema, say so rather than regenerating by inertia
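The sweep for a retired literal can be triaged by path before reading each hit. This is a rough sketch: the directory prefixes are assumptions about repo layout, the helper names are hypothetical, and hits in active specs, tickets, or docs still need a human decision between the two allowed classifications.

```typescript
type HitClass =
  | "owned-code-or-test"       // must reach zero hits
  | "historical-allowed"       // archived or report text, kept as-is
  | "review-for-active-drift"; // active spec/ticket/doc text, needs a decision

// Classifies one search hit by its path. Prefixes are assumed, not verified
// against the repo; adjust them to the real directory layout.
function classifyRetiredLiteralHit(path: string): HitClass {
  const under = (prefixes: readonly string[]): boolean =>
    prefixes.some((prefix) => path.startsWith(prefix));
  if (under(["archive/", "reports/"])) return "historical-allowed";
  if (under(["tickets/", "specs/", "docs/"])) return "review-for-active-drift";
  return "owned-code-or-test";
}

// The no-alias proof holds only when no hit remains in owned code or tests.
function noAliasSweepPasses(hitPaths: readonly string[]): boolean {
  return hitPaths.every((path) => classifyRetiredLiteralHit(path) !== "owned-code-or-test");
}
```

Defaulting unknown paths to the strictest class keeps a hit in an unexpected location from being silently treated as historical.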
If the ticket only changes population of an existing serialized field and does not change field name, required/optional status, enum values, property type, schema source, or generated artifact shape, classify it as existing serialized-field population. Still validate the serialized consumer path with focused tests and schema/artifact checks when nearby, but do not assume artifact regeneration is owned unless a generator or schema check reports drift.
Bounded Local Refactor Fast Path
When ticket triage confirms a bounded local refactor, load references/bounded-local-refactor.md for the lean 9-step path. Still emit the full working-notes checkpoint before coding and still perform the final acceptance sweep before closeout.
Use the same fast path for compact docs-plus-fixture tickets when reassessment shows no behavior, schema, generated artifact, or golden fallout. Still classify explicit cookbook/manual-preview deliverables, canonical fixture/helper placement, new-vs-modified test expectations, and any package-build-to-dist proof ordering before coding. If the ticket asks for a GUI or rendered markdown preview but no repo checker or GUI workflow is available, record the non-GUI source-review substitution in the active ticket before final proof.
Phase 1: Read and Understand
- Read docs/FOUNDATIONS.md before planning or coding.
- Read the ticket file(s) matching the provided path or glob.
- If the supplied path is missing, search for the nearest active ticket by normalized ticket id or stem before widening scope. If exactly one plausible replacement exists, use it and record ticket entry correction: <requested path> -> <resolved ticket> in working notes; if resolution is ambiguous, stop and clarify.
- Read referenced specs, docs, and Deps. Read AGENTS.md and respect worktree discipline (all reads, edits, greps, moves, and verification commands use the worktree root when the ticket lives under .claude/worktrees/<name>/).
- If equivalent AGENTS.md instructions are already in session context, rely on that context but still prefer the file when repo-local details might differ or the ticket references on-disk policy.
- If the user explicitly points at rulebooks, rules directories, production game data, or other rule-authoritative assets, verify early whether the live bug may belong in authored spec/policy data rather than assuming the fix is engine-only. In this repo, game-backed regressions can widen from runtime code into
data/games/<game>/... or similar GameSpecDoc-owned surfaces while still remaining the truthful ticket boundary.
- If a referenced spec or sibling ticket is already dirty from concurrent work and the active ticket does not require a parity correction there, classify it as concurrent state, call it out in working notes or commentary, and leave it untouched.
- Inspect repo state (e.g., git status --short) early. Call out unrelated dirty files, pre-existing failures, or concurrent work so your diff stays isolated.
- Extract all concrete references: file paths, functions, types, classes, modules, tests, scripts, and artifacts the ticket expects.
- When a draft or recently edited ticket names specific files, prefer a quick path-validation pass (rg --files, targeted find, or equivalent) before opening the file directly if there is any sign of path drift.
- When the ticket retires or removes a public contract, literal, or descriptor shape, do one early same-package fallout sweep for that retired surface before locking the owned boundary. Prefer a fast rg over the affected package so same-package tests, fixtures, and presentation layers do not surface only after the first broad proof lane.
- When moving or extracting canonical logic, grep for tests that read source files or assert source text (
readFileSync, source guards, AST-policy tests, contract-style grep tests). Update those guard expectations before broad proof so location-sensitive invariants follow the new canonical owner instead of surfacing late as broad-lane fallout.
- When a ticket names an existing source file that is already near or over repo file-size guidance, and the intended change adds separable logic, prefer a small adjacent helper/extraction before the first implementation edit. If that extraction would be nontrivial, widen ownership, or obscure the ticket seam, record the file-size risk in working notes and keep the final-proof file-size check mandatory.
- When
Files to Touch, acceptance criteria, or verification text names logs, reports, traces, results.tsv, lessons.jsonl, campaign outputs, generated artifacts, or other durable evidence files, run the early tracked/ignored delivery-path check before the working-notes checkpoint. Record whether each artifact will be checked in, transcribed into the ticket/report while the raw file remains ignored, or requires a narrow ignore-rule change. If the delivery path differs from an explicit ticket promise, use the repo 1-3-1 flow before editing.
- When tests need to exercise package
scripts/*.mjs, verify whether the repo build actually emits those files into dist/ before writing dist-relative imports or assertions. In this repo, tests may need to resolve the source script from package root instead.
- When compiled tests read non-TS resources such as JSON fixtures, markdown, corpus files, reports, or other sidecar artifacts, verify whether the build copies those resources into
dist/. If the build does not emit them, read the source fixture through an existing helper or a source-tree-relative path instead of assuming dist/test/... contains the sidecar file.
- For every new module or test you expect to add, decide explicitly whether imports should come from the public package surface or an internal file path. Verify the required export surface before coding rather than discovering it during the first build.
- When a ticket names a private helper as the proof target, prefer proving the behavior through the nearest public runtime seam instead of exporting the helper only for the test. Correct the ticket wording if the public seam is the truthful contract.
- When a ticket explicitly requires a new or modified test to exercise a named public API, alias, wrapper, or runtime entrypoint, map that witness before coding:
ticket-promised witness API, actual assertion entrypoint, lower-level helpers used, and why this still proves the public contract. A lower-level helper assertion may be useful support, but it is not a substitute for the named public seam unless the ticket is corrected or the user authorizes the substitution.
- When a new test depends on runtime-generated identifiers (for example
DecisionKey, bind-expanded names, dynamic branch ids, setup-created token ids, lookupRefKey / hashed lookup ref ids, generated resolvedRefs keys, or similar kernel-owned identity surfaces), do not assume the draft spec or hand-written fixture literals match the live canonical form. Prefer deriving those identifiers from the real runtime seam first and then asserting against that observed canonical sequence.
- When a layout, scaffold, index table, or feature-table ticket asks for instance identifiers, verify whether each id domain is declared in
GameDef, compiler-lowered from GameSpecDoc, or runtime-generated through setup/initialization. Derive the durable layout or test from the authoritative seam instead of inventing a parallel source contract.
- When a new conformance or regression test depends on production states reaching a specific runtime seam, run the smallest live witness-discovery probe before writing durable fixtures: build first if the probe consumes
dist/, exercise the nearest public runtime API, search only enough seeds/steps to classify whether the seam is active or non-applicable, and record the probe as reassessment evidence rather than as the final proof. Encode the durable test through normal repo helpers after the boundary is settled.
- Bound the probe before launching it: choose a small candidate set, wrap slow commands in an explicit timeout or equivalent stop condition, state the expected first-output/completion window, and name the fallback proof shape if no cheap live witness appears. Production
runGame probes can be slow and silent, so do not discover the bound only after a broad seed/turn sweep is already running.
- When the ticket names wildcard acceptance checks or "returns empty" grep lanes, validate those patterns against the live repo early, especially if they span files outside the owned Files to Touch slice. Do not defer repo-wide empty-match assumptions until after coding.
- If that early validation shows the ticket's literal pattern is broader than the real owned invariant, stop treating the draft pattern as authoritative. Decide the narrowest truthful live boundary up front, then carry that corrected proof description into working notes and the active ticket closeout.
- Sanity-check ticket-named verification commands against live repo tooling before relying on them later.
- For bounded local refactors with straightforward verification, a light command-sanity pass is enough at this stage.
- When the command is a package-manager script target, prefer verifying the script definition and underlying runner entrypoint before using --help or ad hoc flags as a probe.
- When a ticket names a root pnpm turbo <task> command, verify that <task> exists in the root turbo.json task graph before treating it as runnable. If the task only exists as a package-local package.json script, record the package-filtered substitution (for example pnpm -F <package> run <task>) in working notes and the active ticket before final proof.
- When the owned change is a new test file only, verify how the acceptance lane discovers files before assuming the ticket's named command covers the new witness. Check whether membership is explicit, manifest-driven, or directory-derived, and record any resulting command or lane-coverage correction before closeout.
- When editing a test lane manifest, validate both inclusion and exclusion. For this repo's
packages/engine/scripts/test-lane-manifest.mjs, check the target lane plus all relevant predicate families: default/core, game-package, FITL-rules, FITL-events shards, Texas-cross-game, e2e, determinism, policy-profile-quality, and slow-parity shards. Prefer a cheap manifest-list/probe or a narrow wrapper run before paying for the full expensive lane, and record any lane-routing correction in the active ticket before final proof.
- When a focused proof consumes
dist/ or another regenerated build output, do not run that proof command in parallel with the build or artifact-generation step that is mutating the same output tree. Finish the producer step first, then run the consumer against stable output.
- When a focused command passes explicit file paths or patterns to a lane wrapper, verify whether that invocation preserves the wrapper's normal timeout, sharding, reporter, and sequential/batched semantics. If explicit-path mode drops the timeout or otherwise changes the proof contract, use a bounded wrapper such as
timeout, record the substitution in the active ticket, and do not infer the original lane's budget behavior from the focused command alone.
- Load
references/verification.md now when the command sanity check itself is nontrivial or already reveals output contention, stale-runner drift, or tracked-vs-draft correction work that needs the fuller guidance. If the ticket category or expected proof shape independently requires verification.md later (for example non-bounded work, shared outputs, multi-lane acceptance proof, migration fallout, or environment/tooling ambiguity), record verification.md required before final proof in working notes instead of treating this command-sanity fast path as a permanent skip.
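The root-task check described in this list (a ticket-named pnpm turbo <task> versus a package-local script) can be sketched as a lookup over the two parsed config files. This is a hedged illustration: resolveTaskCommand is a hypothetical helper, the example package name is invented, and the sketch assumes the task graph lives under a top-level tasks key (or pipeline in older Turborepo versions) and ignores package-scoped keys such as <package>#<task>.

```typescript
// Minimal shapes for the two parsed config files; only the fields used here.
interface TurboConfig {
  tasks?: Record<string, unknown>;
  pipeline?: Record<string, unknown>; // older Turborepo configs
}

interface PackageManifest {
  name: string;
  scripts?: Record<string, string>;
}

const hasKey = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Returns the command to record: the root turbo command when the task is in
// the root graph, the package-filtered substitution when it exists only as a
// package script, or null when neither defines it.
function resolveTaskCommand(
  task: string,
  turbo: TurboConfig,
  pkg: PackageManifest,
): string | null {
  const graph = turbo.tasks ?? turbo.pipeline ?? {};
  if (hasKey(graph, task)) return `pnpm turbo ${task}`;
  if (pkg.scripts !== undefined && hasKey(pkg.scripts, task)) {
    return `pnpm -F ${pkg.name} run ${task}`;
  }
  return null;
}
```

A null result is the signal to stop and correct the ticket's verification text rather than guess at a nearby command.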
Session, Series, and Draft Context
Load references/implementation-general.md for session continuity, series-slice discipline, named fallout classification, the active draft series sanity check, and ticket re-entry classification after a prior follow-up split when the ticket is not a bounded local refactor. For bounded local refactors, defer this load unless reassessment reveals split ownership, sibling drift, reopened follow-up context, or another concrete need for the broader series guidance.
Load references/closeout-and-followup.md as soon as a tracked ticket is expected to receive a durable outcome block, touched-file correction, or terminal status update. Do not wait until after implementation when the ticket has explicit deliverable lists or repo-local status transitions that need to shape the final proof plan.
When the active ticket names a sibling ticket as a condition for a deliverable, search both active tickets/ and archived ticket roots such as archive/tickets/ before deciding whether that condition has fired. Record the resolved sibling state in working notes when it changes the owned closeout work.
Load references/draft-handling.md when the active ticket or referenced artifacts are untracked drafts, or when a tracked ticket appears stale, and draft status creates real reassessment or ownership ambiguity. For bounded local refactors, untracked-draft status alone does not force this load if the active draft can be kept honest through working notes, direct reassessment, and durable closeout updates.
Phase 2: Reassess Assumptions
- Verify every referenced artifact against the live codebase with targeted reads and rg. Load references/triage-and-resolution.md (Artifact Verification Checklist section) for what to validate: file existence, exports/signatures, callsite ownership, claimed dead fallbacks, widened compilation families, and auto-synthesized outputs.
- If those checks show the ticket's named code/test/module slice is already landed on live HEAD, explicitly classify the run as verification + truthful closeout rather than fresh implementation. From that point, the owned work is to validate the proof surface, confirm whether any cited blocker still reproduces, and rewrite ticket/spec/sibling artifacts before claiming completion or retaining BLOCKED.
- Build a discrepancy list and classify each item per references/triage-and-resolution.md (Stale-vs-Blocking Triage). When legality/admissibility and sampled completion surfaces disagree, follow the Legality/Admissibility Contradiction Playbook in that reference before widening retries, adding fallbacks, or rewriting the boundary.
- For proof, benchmark, audit, regression, or invariant-locking tickets, explicitly check whether any named warning, rejection, event, or failure surface is the architectural invariant itself or only one manifestation of it. If the live code preserves the broader invariant through a different layer or rejection surface, stop and reconcile the ticket/spec before changing production code just to force the named symptom surface.
- When a broad acceptance lane fails or stalls inside a corpus that repo doctrine already classifies as advisory, non-blocking, or separately owned (for example via
docs/FOUNDATIONS.md, lane-taxonomy tests, or CI workflow intent), verify that ownership before treating the surfaced file as a production-fix or harness-fix requirement. If the repo doctrine says the corpus should not block the owned ticket, prefer a truthful proof-boundary correction over repairing the advisory witness just to preserve the stale lane shape.
- When an upstream result can be reclassified downstream (for example
completed becoming a rejected or dead-end candidate later in the pipeline), verify that the ticket-owned diagnostic payload or invariant survives that handoff before changing retry policy, adding fallbacks, or rewriting the ticket boundary. Do not assume the first result surface is the only place the owned invariant must remain observable.
- For microturn publication, recovery, rollback, fallback, or pass-fallback tickets, run an explicit one-rules-protocol parity sweep before treating the first fix as complete. Check the relevant surfaces together:
publishMicroturn, applyPublishedDecision, applyMove, applyTrustedMove, legalMoves, enumerateLegalMoves, probeMoveLegality, and probeMoveViability. When the ticket/spec proposes direct applyMove / applyTrustedMove for an incomplete action-selection or continuation move, validate the first live transition seam against publishMicroturn + applyPublishedDecision before locking the implementation shape. The invariant should survive publication, raw/classified enumeration, direct apply, trusted apply, and probe/admissibility paths; otherwise broad parity lanes can expose the same bug after the focused seam is green.
- For `chooseN`/per-option preview tickets, classify the live continuation after each root option before locking the driver shape: `same chooseN with legal ADDs`, `same chooseN completion-only/confirm-only`, or `different microturn/result`. Record whether completion-only continuation is handled by an existing inner-completion path or changes the ticket-owned contract; use 1-3-1 if that classification changes an explicit deliverable rather than clarifying the live algorithm.
- For policy-preview or agent-preview tests over production incomplete action-selection moves, do not assume `enumerateLegalMoves` entries will carry a `trustedMove`. If the production surface publishes only the move plus viability, exercise the preview/runtime classification fallback rather than constructing a synthetic trusted-move index unless the ticket specifically owns trusted-move production.
- For investigation tickets that specify how to measure something, explicitly check whether the ticket's proposed probe method exercises the same live semantic seam as the subsystem being characterized. If the requested method and the live kernel/runner/agent seam differ, stop and reconcile that before generating durable evidence artifacts.
- Explicitly check that each ticket/spec-required key input, identifier, or artifact is actually owned by the module boundary you are about to change. If a requirement depends on data that this seam does not legitimately receive or control, stop for 1-3-1 before coding rather than widening the API or silently weakening the requirement ad hoc.
- Check constraints the ticket may have underspecified. Load `references/schema-and-migration.md` (Reassessment Surfaces section) for the full shared-contract / cross-package / fixture / test-harness / rulebook / repro-reduction checklist.
- When the contradiction is specifically a stale witness input rather than a production-code bug, classify that separately from ordinary scope drift. If the user authorizes re-blessing, prefer replacing the witness with the narrowest validated live witness instead of widening semantics just to preserve the old example.
- For bounded local refactors in this repo, if you add or rename a warning/error/event/schema literal, immediately check shared type unions, `schemas-core` definitions, and generated schema artifacts before treating the change as purely local.
- If you add, rename, or make required a serialized trace/result field, diagnostic outcome, generated-schema property, or exported union member, immediately search downstream package fixtures and consumers for hand-authored object literals or exhaustiveness assumptions before the first broad proof lane. Treat runner, UI, report, trace-summary, and golden-fixture fallout as shared-contract fallout, not surprise late cleanup.
- For mutable caches or memo tables, decide `sharedStructural` versus `runLocal` explicitly instead of treating mutability alone as decisive. Verify: the cache key universe is bounded by the compiled artifact; cached values are pure functions of structural inputs; sharing cannot change cross-run semantics; and fork/reset is required only if one of those proofs fails.
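The cache decision above can be read as a small function over the three proofs. This is a minimal sketch, not repo code: `classifyCacheOwnership` and its field names are hypothetical, and mapping any failed proof to `runLocal` is an assumption about how the rule is applied.

```javascript
// Hypothetical helper: the function name and proof field names are illustrative only.
// Each flag records whether the corresponding proof from the rule above holds.
function classifyCacheOwnership(proofs) {
  const checks = [
    ['keyUniverseBoundedByCompiledArtifact', proofs.keyUniverseBoundedByCompiledArtifact],
    ['valuesArePureFunctionsOfStructuralInputs', proofs.valuesArePureFunctionsOfStructuralInputs],
    ['sharingPreservesCrossRunSemantics', proofs.sharingPreservesCrossRunSemantics],
  ];
  // Anything other than an explicit `true` counts as an unproven check.
  const failedProofs = checks.filter(([, ok]) => ok !== true).map(([name]) => name);
  return {
    // Assumption: a cache that fails any proof is treated as run-local.
    ownership: failedProofs.length === 0 ? 'sharedStructural' : 'runLocal',
    // Fork/reset is required only when at least one proof fails.
    forkOrResetRequired: failedProofs.length > 0,
    failedProofs,
  };
}
```

Recording `failedProofs` in working notes keeps the decision auditable: mutability alone never appears as an input.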
For investigation-ticket-specific reassessment rules (minimal witness probe before durable artifact generation, long-running measurement narrowing, command-reproducer vs artifact-capture separation, observer-effect check, large-fixture derivation), load references/specialized-ticket-types.md (Investigation Ticket Reassessment Patterns section).
Load references/triage-and-resolution.md when discrepancy classification is nontrivial, when the ticket is not a bounded local refactor, or when reassessment reveals boundary-affecting drift that would benefit from the fuller taxonomy. A bounded local refactor may skip this load if the discrepancy handling remains straightforward and is still recorded explicitly in working notes.
If the change involves a mid-migration state or ticket rewrite, load references/schema-and-migration.md (Migration & Rewrite Awareness section).
- If correcting one ticket changes ownership within an active series, load `references/implementation-general.md` (Series Consistency section) and follow the sibling coherence rules.
- If the active ticket absorbs work originally owned by sibling draft tickets, plan the sibling-ticket status rewrite as part of closeout, not as optional cleanup after acceptance. The series artifact should tell the same ownership story as the final code and proof set.
- If reassessment instead proves that the remaining implementation work belongs to a sibling ticket rather than the currently active one, stop and make that handoff explicit. Restate the successor owner, confirm the user has authorized that boundary change when required, update the successor and any affected sibling/spec artifacts before more code changes, then emit a fresh working-notes checkpoint and continue under the successor ticket's proof surface.
- If the active ticket's corrected live contract changes the interface, call shape, touched-file expectation, or verification assumption used by dependent active tickets in the same series, update those dependent tickets in the same turn before final proof so the active series remains internally consistent.
- If that same boundary correction invalidates design language, assumptions, or the ticket list in an active spec, update the active spec in the same turn before final proof so tickets and specs stay parity-aligned.
- If the active ticket uncovers a broader architectural gap that extends beyond the owned implementation slice but is now evidenced concretely by live code, tests, or rules artifacts, do not leave that discovery implicit. Propose or draft the narrowest truthful follow-on spec/design artifact before final closeout when the user wants series artifacts kept current, or record it explicitly as required follow-up ownership when the user prefers to defer spec work.
- Treat this as an architecture-gap extraction case rather than ordinary ticket sprawl when the local fix is valid but the session proved a missing cross-ticket contract such as runtime cache ownership, terminal-phase semantics, or another boundary that should govern future tickets as well.
- For staged sibling-series slices where the current ticket removes an invocation/call path and a later sibling owns deleting the now-unused helper/body, keep the grep/proof invariant scoped to the current ticket's owned call path, not the successor-owned symbol definition. Make any intentionally retained dead code lint-safe with a narrow temporary suppression or equivalent comment naming the successor ticket; if repo policy or `FOUNDATIONS.md` would forbid that temporary state, stop for 1-3-1 before either deleting sibling-owned code or leaving an unlintable implementation.
- If stronger live evidence contradicts an archived sibling ticket's benchmark or investigation verdict, load `references/triage-and-resolution.md` (Archived Sibling Contradiction section) and classify the contradiction explicitly before coding.
Phase 3: Resolve Before Coding
Load references/triage-and-resolution.md (Stop Conditions and Boundary Resets section) for the full resolve-before-coding discipline: stop conditions (factually wrong ticket, unverifiable bug claim, scope gaps, semantic acceptance drift), 1-3-1 workflow, authoritative-boundary restatement, rewritten-clause sanity check, proof-shape classification, partial-completion/new-blocker handling, and acceptance-lane blocker classification.
For boundary-reset decision paths that apply after a bounded slice lands but the acceptance story shifts (broad acceptance lane classification, moved live blocker, diagnostic narrowing loop, non-implementation boundary rewrite cleanup, same-ticket widened re-entry, successor ticket re-entry, post-closeout reopen), load references/boundary-reset-recovery.md.
STOP before coding: emit the compact working-notes checkpoint now, using the checklist under ## Working Notes. Do this after reassessment has identified the authoritative boundary and before the first file edit, even when the boundary is straightforward.
Implementation Rules
Load references/implementation-general.md by default for non-bounded tickets, and for bounded local refactors only when the ticket widens beyond a simple local change, exposes split ownership/follow-up handling, or otherwise needs the broader implementation guidance. Covers general principles, TDD for implementation-discovered defects, narrowest-witness preference, bounded campaign reductions, diagnostic instrumentation, follow-up ticket creation, exploratory-diff classification, named-witness regression loop, representative-corpus preflight, same-ticket widened continuation, synthetic fixture setup, regression placement triage, and direct fallout test triage.
If the ticket is a mechanical refactor, compiler diagnostic, gate/audit, investigation, groundwork, or production-proof/regression ticket, load references/specialized-ticket-types.md.
For profiling or benchmark red-gate tickets, use a fast reference set unless reassessment triggers heavier guidance: references/working-notes.md, references/ticket-type-triage.md, references/specialized-ticket-types.md for gate/audit/profiling guidance, and references/verification.md for measured-gate and CPU-profile handling. This profiling fast path overrides the default non-bounded-ticket load of references/implementation-general.md; load implementation-general.md only when one of the heavier-guidance triggers below appears. Load references/closeout-and-followup.md before creating/updating any successor, dependent-ticket rewrite, spec ticket-list update, status transition, or other ticket-graph closeout artifact. Ticket-graph closeout alone does not require implementation-general.md when closeout-and-followup.md covers the handoff. Load broader references such as references/implementation-general.md, references/triage-and-resolution.md, references/schema-and-migration.md, or references/verification-acceptance-proof.md only when triggered by split ownership, nontrivial discrepancy, shared contract/schema fallout, noisy harness behavior, command-wrapper ambiguity, or post-proof invalidation.
If the change touches schemas, contracts, goldens, or involves a migration, load references/schema-and-migration.md. Covers in-memory vs serialized decisions, post-migration sweeps, identifier consumer sweeps, interim shared-contract state for staged tickets, and historical benchmark worktree handling.
For serialized trace/result shape migrations, use this compact checklist before the first broad proof lane:
- update the authoritative source type or union and every direct writer of the field
- update schema source and regenerate/check the generated schema artifact
- search direct readers, hand-authored object literals, report/diagnostic consumers, fixtures, goldens, and exhaustiveness assumptions
- include downstream package consumers in that search. For example, engine trace-shape changes can require runner trace fixtures/subscribers, UI trace consumers, report fixtures, JSON-schema tests, and golden trace files even when the ticket is otherwise engine-focused.
- prove any intended trace-tier boundary such as verbose-only emission or summary omission
- add or update a focused replay/determinism witness when the field is serialized, ordered, or seed-sensitive
- include at least one proof lane that consumes the generated output or public serialized contract, plus the package/workspace typecheck lane when downstream consumers can see the shape
Verification
Before the final acceptance-proof pass, pause on the explicit checkpoint from the Final-Proof Gate above: Will the active ticket artifact change after this proof lane? If yes, update the ticket first and only then run the final acceptance-proof set. For active draft tickets, treat the Gate as mandatory, not advisory — the ticket artifact should already be truthful before the first proof lane you plan to cite as final acceptance. For expensive benchmark/evidence tickets, use references/verification-acceptance-proof.md's transcription-edit distinction: recording already-run metrics may require consistency and hygiene checks rather than rerunning the expensive measurement, but edits that change status, thresholds, acceptance boundaries, or proof claims remain proof-affecting.
When an acceptance check requires a temporary negative edit to prove a lint rule, config guard, validator rejection, or similar fail-closed path, make that choreography explicit before running it: name the temporary file/edit, run the expected-fail command, capture the diagnostic text that proves the owned guard fired, revert the temporary edit immediately, verify the temporary path no longer appears in git diff / git status --short, rerun the affected clean lane, and record the red-then-clean sequence in the active ticket outcome. Treat the expected-fail lane as proof of the guard behavior, not as a red final lane, only after the revert and clean rerun are complete.
Load references/verification.md for non-bounded tickets, or for bounded local refactors once verification planning becomes nontrivial because of shared outputs, multi-lane acceptance proof, migration fallout, or environment/tooling ambiguity. Covers command sanity check, verification preflight, execution order, build ordering and output contention, verification safety, escalation ladder, failure isolation, schema & artifact regeneration, standard commands, and measured-gate outcome. In the working-notes reference guidance loaded line, explicitly answer verification.md required? yes/no; when yes, load it before the final-proof gate even if early command sanity stayed simple.
For performance, profiling, or audit gates, a green TAP/process exit is not enough if the command does not assert or print the ticket-owned metric. Confirm that the lane reports the concrete value, threshold, and verdict the ticket needs; otherwise run or create a repo-owned measurement path before closeout and record that command substitution in the active ticket.
- For ABI, routing, or hot-path handoff tickets, confirm before the decisive profile that the harness reports the activation diagnostics needed to prove the owned route was actually exercised: route counts, unsupported counts, compile/cache counts, bucket ownership, or equivalent. If those counters are missing, add the smallest generic instrumentation first, run a syntax/focused proof for the harness, and only then run the profile you intend to cite as final.
- When adding diagnostic counters or helper exports for profiling proof, classify the export surface before coding as `public`, `test/internal`, or `script-only`. Prefer the narrowest internal or script seam; if a package barrel or kernel index makes the counter reachable from a wider surface, record why that widened path is acceptable and include the classification in the active ticket outcome.
- For binary/WASM/FFI tickets, grep host and target ABI identity constants before the first broad proof whenever you change magic, version, layout, or export shape. Keep the TypeScript/Rust (or equivalent host/target) constants synchronized before interpreting runtime-load failures as deeper ABI defects. After an ABI identity or export-shape change, run the proof pair that catches both classes of drift: rebuild the target artifact, rebuild the host package if it consumes generated output, run the focused feature/parity test, and run the runtime loader or ABI validation test that would fail on magic/version/layout/export mismatch.
- For V8 CPU profiles, a compact triage pass is enough before deciding ownership: parse the `.cpuprofile`, rank self-time samples by function/file, and compare the top samples to the ticket-owned seam. Use this to distinguish "the owned hot path still dominates" from "the owned path was removed and a different root cause now dominates" before proposing a split or further optimization.
- Prefer the bundled parser when available: `node .codex/skills/implement-ticket/scripts/parse-cpuprofile.mjs <profile.cpuprofile> --targets fnv1a64,resolveRef,evalCondition`. It reports top self-time functions, top files, and parent-stack ownership for named targets. Use ad hoc `node -e` snippets only when the parser is missing or needs a one-off grouped percentage not yet supported.
- Shell-safe parser recipe for quick triage:
node -e "const fs=require('fs'); for (const file of process.argv.slice(1)) { const p=JSON.parse(fs.readFileSync(file,'utf8')); const idToNode=new Map(p.nodes.map(n=>[n.id,n])); const parent=new Map(); for (const n of p.nodes) for (const c of n.children||[]) parent.set(c,n.id); const self=new Map(); for (const id of p.samples||[]) { const n=idToNode.get(id); const name=n?.callFrame?.functionName||'(anonymous)'; self.set(name,(self.get(name)||0)+1); } console.log('\\nFILE '+file); console.log([...self].sort((a,b)=>b[1]-a[1]).slice(0,15).map(([k,v])=>k+':'+v).join('\\n')); }" <profile.cpuprofile>
For parent-stack ownership, extend the same script by filtering samples for the hot function and counting parent.get(sampleId) function names. Avoid shell template literals in node -e snippets; use string concatenation so backticks are not eaten by the shell.
- For tickets whose acceptance is a named aggregate bucket rather than one top frame, also compute a grouped percentage. Use exact function/file matchers that mirror the ticket wording, report `selected / total = pct`, and record the parser method in the ticket outcome. Example shape:
node -e "const fs=require('fs'); const groups=[['kernel-query',['resolve-ref.js','eval-condition.js','eval-value.js','eval-query.js','spatial.js','token-filter-compiler.js']]]; for (const file of process.argv.slice(1)) { const p=JSON.parse(fs.readFileSync(file,'utf8')); const idToNode=new Map(p.nodes.map(n=>[n.id,n])); const total=(p.samples||[]).length; console.log('FILE '+file); for (const [label,files] of groups) { let selected=0; for (const id of p.samples||[]) { const n=idToNode.get(id); const url=n?.callFrame?.url||''; if (files.some(f=>url.endsWith(f))) selected++; } console.log(label+'='+selected+'/'+total+' pct='+(selected/total*100).toFixed(2)); } }" <profile.cpuprofile>
For small CPU-profile percentage shifts, classify the result explicitly instead of implying a wall-clock win: record the selected sample counts, denominator, percentage delta, wall-clock direction, parser method, and verdict (material, small but accepted for owned bucket, or noise/unproven). Require repeated runs before claiming wall-clock improvement or gate unblocking; one same-seam profile may be enough only for a narrowly stated sample-bucket reduction or ownership classification.
- When a CPU-profile result changes ownership or justifies a follow-up split, record a compact ownership ledger in the active ticket before final proof:
profile command
profile artifact path or explicit ephemeral-path note
parser command or parser method
baseline metric
current metric
ticket-owned stack samples
residual stack samples
successor owner or explicit no-follow-up rationale
- When the CPU-profile result justifies a successor ticket, copy the evidence into that successor with:
profile command, profile artifact, parser method, top owners, and why this successor is non-overlapping.
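The parent-stack ownership extension mentioned in the CPU-profile bullets above can be sketched as a standalone function. This is an illustrative sketch, assuming the standard `.cpuprofile` shape (`nodes` with `id`, `callFrame`, `children`, plus a `samples` array of node ids); `parentOwnership` is a hypothetical name, not the bundled parser's API.

```javascript
// Count which direct parent functions own the self-time samples of one hot function.
// `profile` is an already-parsed .cpuprofile object.
function parentOwnership(profile, hotFunctionName) {
  const idToNode = new Map(profile.nodes.map((n) => [n.id, n]));
  // Build child id -> parent id from each node's children list.
  const parentOf = new Map();
  for (const n of profile.nodes) {
    for (const childId of n.children || []) parentOf.set(childId, n.id);
  }
  const counts = new Map();
  for (const sampleId of profile.samples || []) {
    const node = idToNode.get(sampleId);
    if (node?.callFrame?.functionName !== hotFunctionName) continue;
    const parent = idToNode.get(parentOf.get(sampleId));
    const parentName = parent
      ? parent.callFrame?.functionName || '(anonymous)'
      : '(no parent)';
    counts.set(parentName, (counts.get(parentName) || 0) + 1);
  }
  // Highest sample count first, as [parentName, sampleCount] pairs.
  return [...counts].sort((a, b) => b[1] - a[1]);
}

// Tiny synthetic profile (made-up data) showing the expected input shape.
const exampleProfile = {
  nodes: [
    { id: 1, callFrame: { functionName: '(root)' }, children: [2, 3] },
    { id: 2, callFrame: { functionName: 'evalCondition' }, children: [4] },
    { id: 3, callFrame: { functionName: 'evalValue' }, children: [5] },
    { id: 4, callFrame: { functionName: 'resolveRef' } },
    { id: 5, callFrame: { functionName: 'resolveRef' } },
  ],
  samples: [4, 4, 5, 2],
};
console.log(parentOwnership(exampleProfile, 'resolveRef'));
// → [ [ 'evalCondition', 2 ], [ 'evalValue', 1 ] ]
```

The resulting pairs map directly onto the "ticket-owned stack samples" and "residual stack samples" rows of the ownership ledger.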
Load references/verification-acceptance-proof.md for acceptance-proof discipline: acceptance command reconciliation, wrapper and child command isolation, silent-but-healthy lanes, broad-lane failure classification, post-proof-edit invalidation, focused recovery loop, unrelated failure vs owned regression, package script / runner widening, and final-proof choreography.
If a Node, package, or lane-wrapper test fails with sandbox-style child-process errors such as spawnSync /bin/sh EPERM, classify the first result as provisional environment failure unless the assertion itself is visible. Rerun the smallest failing lane with appropriate escalation or an equivalent unsandboxed proof, then rerun the ticket-named broad lane when it is part of final acceptance before citing the result. Record the sandbox rerun substitution in the ticket outcome or final closeout.
When a CI or workflow lane is ticket-owned, acceptance-critical, newly failing locally, or used to explain why CI is green while local proof is red, audit workflow gating before closeout. Search .github/workflows for continue-on-error and continue_on_error, classify each hit as Foundations-backed advisory, temporary tactical relief, or masking a required gate, and restore blocking semantics for required gates unless the ticket explicitly owns keeping the mask. If keeping or removing a mask changes an explicit ticket deliverable, stop for 1-3-1; otherwise update the active ticket/spec/sibling artifacts before final proof. Use a compact ledger when more than one workflow is touched:
workflow file | job/lane | continue-on-error kind | rationale source | decision | proof
Do not remove an advisory non-blocking lane merely because it is non-blocking when repo doctrine explicitly says it is advisory. In this repo, docs/FOUNDATIONS.md distinguishes blocking determinism proof from non-blocking policy-profile-quality witnesses; use that kind of source as the rationale when preserving a mask.
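The search step of the workflow-gating audit can be sketched as a pure function over workflow text. This is a minimal sketch under stated assumptions: `findContinueOnError` and `ledgerRow` are hypothetical helpers, the matcher is line-based rather than a real YAML parser, and classification is deliberately left to the person reading the rationale source.

```javascript
// Find `continue-on-error` / `continue_on_error` keys in one workflow file's text.
// Line-based on purpose: commented-out lines and mentions inside other values are skipped
// because the key must be the first token on the line (optionally after a list dash).
function findContinueOnError(workflowFile, yamlText) {
  const keyPattern = /^\s*(?:-\s*)?(continue-on-error|continue_on_error)\s*:\s*(.*)$/;
  const hits = [];
  yamlText.split('\n').forEach((line, index) => {
    const match = keyPattern.exec(line);
    if (match) {
      hits.push({ workflowFile, line: index + 1, key: match[1], value: match[2].trim() });
    }
  });
  return hits;
}

// Render one ledger row in the column order used above.
// Every classification field is supplied by the reviewer, not inferred by the script.
function ledgerRow(hit, { jobLane, kind, rationaleSource, decision, proof }) {
  return [hit.workflowFile, jobLane, kind, rationaleSource, decision, proof].join(' | ');
}
```

Keeping the scan separate from the classification avoids implying that a script can tell a Foundations-backed advisory lane from a masked required gate.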
Before launching a ticket-named broad package/workspace lane (pnpm -F ... test, pnpm turbo test, broad node --test globs, or similar), choose a bounded stop/progress plan when the lane is likely to be slow/noisy: expected first output or heartbeat, timeout/manual-stop threshold, the narrower focused proof that preserves the owned witness if the broad lane stalls, and how a stall will be classified. For ordinary required broad lanes that are expected to emit regular progress and are not known noisy, commentary progress updates are sufficient while the lane is visibly healthy; record the full stop/fallback plan in working notes or the active ticket when the lane is silent, historically flaky, expected to run for minutes without heartbeat, used as a proof substitution, or likely to expose unrelated residual tails. When the lane is known or suspected to include expensive residual tails, do a cheap owner preflight before paying for the broad run: search active tickets/specs/reports for the likely file names, lane names, or warning labels, and record any already-documented owner or dependency before launching. For bounded local refactors, perform that preflight before ticket-named slow lanes even when the owned focused witness is already green; decide and record whether each broad lane is final acceptance, supplementary evidence, or requires a focused owned substitution before launch. If the residual owner is already documented and the current ticket has focused owned witnesses, plan the broad lane as supplementary/non-final unless the ticket explicitly requires it as the only acceptance proof. Do not discover the bound only after an open-ended broad lane has already consumed the feedback loop.
Focused proof lanes can still be slow or silent when they exercise FITL, production runGame, runGameSteps, random-play helpers, policy agents, or other simulator-heavy paths. Before launching one, do a compact pre-proof check: run it serially unless it is already proven cheap, wrap focused node --test or direct probe commands in an explicit timeout, state the expected first-output/completion window, and identify the fallback narrower proof shape if the lane stalls. Record whether a timeout is an owned failure, a stale witness, or a noisy/pre-existing proof limitation before using it in closeout.
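One way to wrap a focused proof command in an explicit timeout is Node's `spawnSync` with its `timeout` option. The sketch below is illustrative only: `runWithTimeout` and its outcome labels are hypothetical, and a `timeout` outcome still has to be classified by hand as an owned failure, a stale witness, or a noisy/pre-existing limitation.

```javascript
const { spawnSync } = require('node:child_process');

// Run one command serially with a hard wall-clock limit.
// Returns a raw outcome only; ownership classification stays a human decision.
function runWithTimeout(command, args, timeoutMs) {
  const result = spawnSync(command, args, { encoding: 'utf8', timeout: timeoutMs });
  if (result.error && result.error.code === 'ETIMEDOUT') {
    // The child was killed when the timeout elapsed.
    return { outcome: 'timeout', status: null, stdout: result.stdout || '' };
  }
  if (result.error) {
    // The command could not be started at all (for example ENOENT or EPERM).
    return { outcome: 'spawn-error', status: null, stdout: '', errorCode: result.error.code };
  }
  return {
    outcome: result.status === 0 ? 'pass' : 'fail',
    status: result.status,
    stdout: result.stdout,
  };
}
```

A call might look like `runWithTimeout('node', ['--test', '<focused test file>'], 120000)`, with the file path and the 120-second window chosen per lane and recorded next to the fallback proof shape.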
For bounded local refactors, if the owned focused witness is green but a broader acceptance lane fails, rerun the first failing file or command directly before editing anything else. If the broad-lane failure is a timeout, repeated quiet-progress stall, or silent/noisy harness behavior, load references/verification-noisy-harness.md before the direct rerun and use a bounded single-file probe. Classify the result as owned regression, owned fallout, repo-preexisting unrelated blocker, or harness-noisy / not final-confirmed, then record that classification in the active ticket closeout instead of leaving the broad failure ambiguous.
If the failing broad lane was named by the ticket and the direct rerun exposes a changed-path failure, serialized-contract fallout, or architectural invariant touched by the ticket, the active ticket is not closeout-complete. Fix it, or use the repo 1-3-1 flow to get explicit authorization for a narrower durable state or sibling handoff before setting the ticket status to the repo-local terminal implementation wording.
When a proof command appears hung and normal session input cannot stop it cleanly, perform narrow cleanup before continuing: poll the session once, inspect matching processes with a precise pattern (pgrep -af or equivalent), terminate only the command/process group for the hung proof lane, confirm no matching child processes remain, and record the lane as harness-noisy / not final-confirmed or another truthful non-green classification. In Codex sandboxed shells, process probes may match the probe command or sandbox wrapper itself; when the result is ambiguous, use ps -eo pid,ppid,pgid,stat,command | rg '<precise pattern>' or another self-match-resistant check, and distinguish "only the current probe matched" from "the proof lane is still running." Before transcribing a killed or timed-out classification into the active ticket/spec, poll the original exec session one last time when possible; if it later emits output, treat that as late-returned diagnostic evidence, update the ticket/spec before terminal status, and rerun only the lanes affected by the changed evidence story. Do not leave runaway proof processes active, and do not treat a manually terminated lane as either passing or a functional red assertion without direct evidence.
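The self-match problem described above can be illustrated by filtering captured `ps -eo pid,ppid,pgid,stat,command` output. This is a hedged sketch: `findProofProcesses` is a hypothetical helper, the probe-tool list is an assumption, and a sandbox wrapper that embeds the pattern in its own command line still needs its pid passed in `selfPids`.

```javascript
// Keep only processes whose command contains `pattern` and that are not the probe itself.
// Header and blank lines are dropped because they do not match the numeric column regex.
function findProofProcesses(psOutput, pattern, selfPids = []) {
  const ignore = new Set(selfPids.map(String));
  const probeTools = new Set(['rg', 'grep', 'pgrep', 'ps']); // assumed probe executables
  return psOutput
    .split('\n')
    .map((line) => line.trim().match(/^(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.*)$/))
    .filter(Boolean)
    .map(([, pid, ppid, pgid, stat, command]) => ({ pid, ppid, pgid, stat, command }))
    .filter((p) => p.command.includes(pattern))
    // Drop the caller's own pid and its direct children.
    .filter((p) => !ignore.has(p.pid) && !ignore.has(p.ppid))
    // Drop rows whose executable is one of the probe tools.
    .filter((p) => !probeTools.has(p.command.split(/\s+/)[0].split('/').pop()));
}
```

An empty result supports "only the current probe matched"; a non-empty result lists the pids and process groups to terminate before recording the lane's classification.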
If the user redirects scope while a long-running proof lane is active, decide explicitly whether that lane is still useful evidence. Let a non-contending proof finish only when it does not mutate outputs the new work will touch; otherwise terminate it with the same precise cleanup discipline, record the evidence as superseded or non-final, and avoid editing artifacts that the running command may still rewrite. When continuing with non-overlapping work, mention that classification in commentary or the active ticket so the final proof story does not inherit an ambiguous background run.
Load references/verification-noisy-harness.md for silent/noisy harness handling: standalone silent acceptance command, long-running package lane progress triage, owned witness preservation under harness-noisy lanes, single focused proof file silence, artifact hygiene during rerun, harness defect escalation triage, and durable closeout with mixed results.
For evidence states, trace-heavy ticket inspection, campaign/manual validation evidence, saved traces, harness summaries, decision-gap inspection, campaign metrics, and generated artifact triage, load references/verification-evidence.md.
Follow-Up
Load references/closeout-and-followup.md for non-bounded tickets, or for bounded local refactors when closeout needs follow-up classification, ticket blocking, sibling rewrites, or other nontrivial handoff work. Covers the closeout summary, final acceptance sweep, acceptance-proof invalidation rules, tracked-ticket durable outcome block, durable state classification (including repo-local terminal wording such as IMPLEMENTED, plus BLOCKED by prerequisite / PENDING untouched), state-transition ledger, draft-ticket durable closeout, touched-file scope sweep, correction ledger pattern, draft-ticket closeout order, dependency integrity pass, sibling absorbed ownership, and compact final-proof / investigation-ticket / evidence-ticket ledgers.
After implementation proof and terminal ticket status are truthful, hand off to $post-ticket-review for review cleanup, follow-up-ticket decisions, dependency/archive checks, and archival. Do not archive directly under implement-ticket unless the user explicitly included archival in the implementation request and the ticket's written outcome already matches what landed.
Final response must state whether post-ticket-review already ran. If it did not, say plainly that the ticket is implemented but not archived and name $post-ticket-review <ticket> as the next review/archive workflow.
Use this compact final handoff shape when implementation stops before archival:
implemented ticket: active path and terminal status
archive status: implemented but not archived, archived, or post-ticket-review already ran
tracked modified: tracked files changed by this implementation
untracked added: newly created files that git diff --stat will not show
green proof lanes: commands that passed and are final for the owned slice
classified red/non-final lanes: failed, advisory, skipped, or substituted lanes with ownership classification
next workflow: $post-ticket-review <ticket> unless archival already ran or the user explicitly asked to pause
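The `tracked modified` and `untracked added` rows can be derived from `git status --short` output, which prefixes each path with a two-character status code and a space. The following is a minimal sketch with hypothetical helper names (`splitGitStatus`, `formatHandoff`); rename entries (`old -> new`) are passed through unsplit.

```javascript
// Split `git status --short` text into tracked changes and untracked additions.
// Lines are not trimmed: the leading space in " M path" is part of the status code.
function splitGitStatus(shortStatus) {
  const tracked = [];
  const untracked = [];
  for (const line of shortStatus.split('\n')) {
    if (line.length < 4) continue; // skip blank or malformed lines
    const code = line.slice(0, 2);
    const path = line.slice(3);
    (code === '??' ? untracked : tracked).push(path);
  }
  return { tracked, untracked };
}

// Render the compact handoff shape; every non-git field is supplied by the caller.
function formatHandoff({ ticketPath, terminalStatus, archiveStatus, shortStatus, greenLanes, nonFinalLanes, nextWorkflow }) {
  const { tracked, untracked } = splitGitStatus(shortStatus);
  const list = (items) => (items.length > 0 ? items.join(', ') : 'none');
  return [
    'implemented ticket: ' + ticketPath + ' (' + terminalStatus + ')',
    'archive status: ' + archiveStatus,
    'tracked modified: ' + list(tracked),
    'untracked added: ' + list(untracked),
    'green proof lanes: ' + list(greenLanes),
    'classified red/non-final lanes: ' + list(nonFinalLanes),
    'next workflow: ' + nextWorkflow,
  ].join('\n');
}
```

Listing untracked files separately matters because `git diff --stat` will not show them.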
Codex Adaptation Notes
- This skill replaces Claude-specific invocation arguments with normal Codex conversation context.
- Do not rely on Claude-only skills or slash-command behavior.
- Execute implementation directly once the ticket is verified and no blocking discrepancy remains.
- When resuming after context compaction, interruption, or a long handoff summary, do a compact revalidation before continuing: reopen the active ticket, referenced spec, and any successor/sibling artifacts touched since the last proof; run `git status --short`; confirm whether any post-proof edits changed code, status, scope, command semantics, dependency ownership, or acceptance claims; rerun `pnpm run check:ticket-deps` if the ticket graph changed; and classify expensive measured lanes as still-valid transcription evidence or invalidated proof before citing them.
- If the handoff includes an active proof session or command session id, poll that session before starting any new command that could contend with its outputs. Record whether it is still running, passed, failed, stalled, or no longer observable. Do not launch `dist`-, schema-, golden-, or generated-output-contending lanes until the prior producer/consumer has finished or been truthfully classified.
- After a resumed broad lane that rebuilt or cleaned shared output, rerun any earlier focused generated-output consumer proof you still intend to cite as final, or record why that proof is no longer part of final acceptance.
- If the user asks whether you are stuck, asks for status, or interrupts after an apparently hung command, answer before continuing implementation. Poll for precise leftover processes, state whether anything is still running, report the last completed proof or failure, and then continue only if the newest user message does not ask you to pause. Do not silently resume while the user is waiting for a status answer.
- When inspecting markdown from the shell, avoid unescaped backticks in search patterns; prefer plain-string anchors or direct file reads. During closeout sweeps, use single-quoted `rg` patterns or remove markdown backtick fragments; never put markdown backticks inside double-quoted shell strings.
- For profiling or benchmark gate tickets, treat the ticket-owned harness/log/report surface as authoritative over exploratory single-run probes when the two differ.
Example Prompts
Implement tickets/LEGACTTOO-009.md
Implement the ticket at .claude/worktrees/feature-a/tickets/FOO-003.md
Implement tickets/FITLSEC7RULGAP-001*. Read dependent specs first and stop if the ticket is stale.