| name | cas |
| description | Run Zig CAS helpers (`cas`, `cas_smoke_check`, `cas_instance_runner`, `cas_review_session`, `cas_conformance_suite`) for v2 app-server smoke checks, direct thread/turn request execution, detached review-session control, multi-instance fanout, and swarm conformance checks around `$st` claim sets and retry policy. |
cas (Zig App-Server Control)
Overview
$cas is Zig-only in this repo.
Use the native cas dispatcher and subcommands:
cas conformance for swarm conformance checks around $st claims and retry policy.
cas smoke_check for protocol/API smoke checks.
cas instance_runner for method execution across one or many isolated instances.
cas review_session for detached review/start lifecycle control with persisted reviewThreadId handles.
run_cas_tool request (helper alias) for single-request flows via instance_runner --instances 1 --sample 1.
Current cas smoke_check verifies the native client can complete the v2 handshake and reach experimentalFeature/list, thread/start, thread/resume, and turn/steer.
Hook policy:
--hooks inherit is the default and lets the resolved Codex runtime load and run hooks normally.
--hooks off launches CAS-owned app-server processes with --disable codex_hooks; this is best-effort control for CAS-owned stdio and managed websocket transports, not authority over an externally supplied app-server URL.
--hooks require-observed keeps hooks enabled and fails closed with hook_not_observed if the lane captured no hook/started or hook/completed notifications.
hooks_unsupported means the resolved codex app-server --help surface did not prove the app-server can accept the hook control/observation contract.
- Bad observed hook statuses block the CAS lane with precedence hook_blocked, then hook_failed, then hook_stopped.
- JSON outputs that observe hooks include hookSummary; raw hook notifications are written only to local NDJSON paths referenced by hookSummary.hookLogPath.
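The precedence rule above can be sketched as a small shell helper. This is only an illustration of the ordering, not the CAS implementation: the function name worst_hook_failure is hypothetical, and it assumes the caller has already collected the hook failure codes observed in one lane.

```shell
# Pick the blocking failure code from the hook failure codes seen in one lane.
# Precedence follows the text above: hook_blocked, then hook_failed, then hook_stopped.
# Prints nothing and returns 1 when none of the three codes is present.
# NOTE: worst_hook_failure is a hypothetical helper, not part of the cas CLI.
worst_hook_failure() {
  local code candidate
  for candidate in hook_blocked hook_failed hook_stopped; do
    for code in "$@"; do
      if [ "$code" = "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
  done
  return 1
}

worst_hook_failure hook_stopped hook_failed   # prints: hook_failed
```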
Current cas conformance covers these swarm-hardening scenarios:
claim_safe_wave: verify two disjoint $st claims can run in parallel without overlapping lock roots
stale_claim_reclaim: verify expired held claims become stale and return to pending
overload_backoff: verify the bounded retry/backoff policy with a deterministic synthetic overload script
cas conformance is the harness; it is not the owner of durable claims. $st remains the source of truth for claims/runtime/proof metadata.
Current cas review_session is the review-control lane:
start launches detached review/start on a supplied or freshly created parent thread
start supports --parent-mode auto|fresh|reuse; reuse rejects unsafe or unmaterialized parent threads, and fresh-parent startup retries once after a bootstrap materialization turn when the installed codex needs it
start now prefers a CAS-managed loopback websocket app-server so detached review survives the launching CLI process and fresh-process status / wait / interrupt can reconnect to the same review session
wait is the primary completion path for persisted detached review handles and reconnects through the stored websocket session metadata when present
status reads the detached review thread from a fresh CAS process through the persisted transport metadata when that runtime path is supported
interrupt sends turn/interrupt for the persisted detached review turn
reviewThreadId is the recoverable handle. Session records live under ~/.codex/cas/review_sessions/, and CAS appends raw request/response artifacts to a per-review NDJSON log beside that record.
Review boundary:
- Use cas review_session when you need detached lifecycle control: persisted reviewThreadId, explicit interrupt, compatibility diagnostics, or approval/runtime overrides on the detached lane. The websocket-backed lane is local-only and CAS-managed in this repo.
- If you only need a one-shot git-backed review verdict and do not need detached control, prefer native codex review --base ... or codex review --commit ... instead of introducing CAS transport risk. First-party remediation flows should treat native review as the default path.
cas smoke_check is never review proof; it only proves handshake/method reachability.
cas instance_runner is never the production review lane; it is for method probing and schema sanity checks.
When start, start --wait, status, or wait emit JSON, the output includes the detached review handle/result fields plus launch compatibility metadata:
resolvedCodexPath
resolvedCodexVersion
compatibilityVerdict
selectedTransport
selectionReason
degradedFallback
managedServerPid
managedServerListenUrl
managedServerStderrLogPath
orphanTtlSeconds
failureCode
failureHint
reviewResultAvailable
reviewResultSource
reviewResult
findings
overallCorrectness
overallExplanation
overallConfidenceScore
fallbackUsed
fallbackTransport
fallbackExitCode
fallbackOutputText
fallbackErrorText
Use the fields this way:
compatibilityVerdict="compatible" means the detached review launch path succeeded under the resolved codex binary
compatibilityVerdict="incompatible" means CAS identified a detached-review runtime mismatch and failed closed
compatibilityVerdict="not_checked" means no compatibility verdict was persisted for that record yet (older session record or pre-launch failure)
selectedTransport="websocket" means CAS used the managed loopback websocket lane for the detached review session
selectedTransport="native-review" with degradedFallback=true means CAS preserved the review verdict by degrading to native codex review; it is not detached-control proof
failureCode="wait_timed_out" means retry cas review_session wait on the same reviewThreadId or increase --timeout-ms; it is not a successful review
failureCode="review_interrupted" means the detached review was interrupted before it emitted a structured review result
failureCode="approval_denied" means the detached review stopped on an approval or permissions denial before it emitted a structured review result
failureCode="review_failed" means the detached review failed or errored before it emitted a structured review result
failureCode="review_output_missing" means the detached review reached terminal state without a structured review result even though it was not classified as an interrupt or approval failure
failureCode="websocket_bootstrap_failed" means CAS could not launch or connect to the managed websocket app-server before detached review startup completed
failureCode="review_transport_lost" means a persisted websocket-backed detached review could not be reconnected and CAS had to fail closed or degrade to explicit native fallback
failureCode="hooks_unsupported" means the resolved Codex app-server does not support the requested --hooks policy
failureCode="hook_not_observed" means --hooks require-observed was requested and CAS saw no hook notifications in the observed lane
failureCode="hook_blocked", failureCode="hook_failed", or failureCode="hook_stopped" means Codex ran hooks and CAS failed closed on the worst observed hook status
failureCode="parent_thread_not_materialized" or failureCode="unsafe_parent_thread_state" means the supplied parent thread is not safe to reuse for detached review
fallbackUsed=true means --fallback native-review ran codex review and returned its raw text output instead of a structured detached-review result
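As a rough sketch, a caller might turn the failureCode meanings above into a next step. The failure codes are the ones listed above; the function name and the action labels it prints (retry_wait, change_parent_mode, and so on) are hypothetical names chosen for this example, not CAS output.

```shell
# Map a review_session failureCode to a suggested caller action.
# NOTE: the action labels are hypothetical; only the failure codes come from CAS.
next_action_for_failure() {
  case "${1:-}" in
    "")
      echo "no_failure" ;;
    wait_timed_out)
      # Retry wait on the same reviewThreadId or raise --timeout-ms.
      echo "retry_wait" ;;
    parent_thread_not_materialized|unsafe_parent_thread_state)
      # The supplied parent is not safe to reuse.
      echo "change_parent_mode" ;;
    hooks_unsupported|hook_not_observed|hook_blocked|hook_failed|hook_stopped)
      echo "fix_hooks" ;;
    websocket_bootstrap_failed|review_transport_lost)
      echo "check_transport" ;;
    *)
      # review_interrupted, approval_denied, review_failed, review_output_missing,
      # and anything unrecognized: no structured review result was produced.
      echo "report_failure" ;;
  esac
}

next_action_for_failure wait_timed_out   # prints: retry_wait
```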
Review result classification:
- Detached review success requires all of: compatibilityVerdict="compatible", selectedTransport="websocket", fallbackUsed=false, reviewResultAvailable=true, and no blocking failureCode.
- Native-fallback success is a different class of result: fallbackUsed=true means the review text came from native codex review, not detached CAS review. Report it as native fallback, not detached-review proof.
- Transport progress is not review success: reviewThreadId creation, start --wait returning, or status showing a terminal turn is insufficient unless the result fields above classify it as success.
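The classification above can be expressed as a single check over the five fields. The sketch below assumes the caller has already extracted the field values from the JSON output and passes them as plain strings; classify_review_result and its three output labels are hypothetical, and it treats any non-empty failureCode as blocking, which is a simplifying assumption. It does not inspect fallbackExitCode, so a native-fallback result still needs its own success check.

```shell
# Classify a review_session result from its verdict-surface fields.
# Args: compatibilityVerdict selectedTransport fallbackUsed reviewResultAvailable [failureCode]
# NOTE: hypothetical helper; any non-empty failureCode is assumed to be blocking.
classify_review_result() {
  local verdict="$1" transport="$2" fallback_used="$3" result_available="$4"
  local failure_code="${5:-}"
  if [ "$fallback_used" = "true" ]; then
    # Review text came from native codex review, not detached CAS review.
    echo "native_fallback"
  elif [ "$verdict" = "compatible" ] && [ "$transport" = "websocket" ] \
    && [ "$result_available" = "true" ] && [ -z "$failure_code" ]; then
    echo "detached_success"
  else
    # Includes mere transport progress, such as a handle with no result yet.
    echo "not_success"
  fi
}

classify_review_result compatible websocket false true ""   # prints: detached_success
```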
Compatibility note: on Codex 0.118.x stdio, detached review still requires fresh-parent materialization before detached review/start, so CAS --parent-mode auto keeps that pre-materialization path. The websocket-backed review lane exists specifically to restore truthful fresh-process detached control across commands instead of depending on stdio connection lifetime.
Node runtime paths (cas_proxy.mjs, cas_client.mjs, and related wrappers) are removed from this skill and must not be used.
This skill assumes codex is available on PATH and does not require access to any repo source tree.
Zig CLI Iteration Repos
When iterating on the Zig-backed cas helper CLI path, use these two repos:
skills-zig ($HOME/workspace/tk/skills-zig): source for the cas Zig binaries, build/test wiring, and release tags.
homebrew-tap ($HOME/workspace/tk/homebrew-tap): Homebrew formula updates/checksum bumps for released cas binaries.
Quick Start
run_cas_tool() {
  local subcommand="${1:-}"
  if [ -z "$subcommand" ]; then
    echo "usage: run_cas_tool <conformance|conformance-suite|conformance_suite|smoke-check|smoke_check|instance-runner|instance_runner|review-session|review_session|request> [args...]" >&2
    return 2
  fi
  shift || true

  local cas_subcommand=""
  local marker=""
  local -a pre_args=()
  # Map the user-facing alias to the dispatcher subcommand and its --help marker.
  case "$subcommand" in
    conformance|conformance-suite|conformance_suite)
      cas_subcommand="conformance"
      marker="cas_conformance_suite.zig"
      ;;
    smoke-check|smoke_check)
      cas_subcommand="smoke_check"
      marker="cas_smoke_check.zig"
      ;;
    instance-runner|instance_runner)
      cas_subcommand="instance_runner"
      marker="cas_instance_runner.zig"
      ;;
    review-session|review_session)
      cas_subcommand="review_session"
      marker="cas_review_session.zig"
      ;;
    request)
      cas_subcommand="instance_runner"
      marker="cas_instance_runner.zig"
      pre_args=(--instances 1 --sample 1)
      ;;
    *)
      echo "unknown cas subcommand: $subcommand" >&2
      return 2
      ;;
  esac

  # Non-macOS install path: build from the skills-zig checkout and copy the
  # binaries into ~/.local/bin (which must be on PATH for the checks below).
  install_cas_direct() {
    local repo="${SKILLS_ZIG_REPO:-$HOME/workspace/tk/skills-zig}"
    if ! command -v zig >/dev/null 2>&1; then
      echo "zig not found. Install Zig from https://ziglang.org/download/ and retry." >&2
      return 1
    fi
    if [ ! -d "$repo" ]; then
      echo "skills-zig repo not found at $repo." >&2
      echo "clone it with: git clone https://github.com/tkersey/skills-zig \"$repo\"" >&2
      return 1
    fi
    if ! (cd "$repo" && zig build -Doptimize=ReleaseSafe); then
      echo "direct Zig build failed in $repo." >&2
      return 1
    fi
    if [ ! -x "$repo/zig-out/bin/cas" ] || [ ! -x "$repo/zig-out/bin/cas_review_session" ] || [ ! -x "$repo/zig-out/bin/cas_smoke_check" ] || [ ! -x "$repo/zig-out/bin/cas_instance_runner" ] || [ ! -x "$repo/zig-out/bin/cas_conformance_suite" ]; then
      echo "direct Zig build did not produce the full CAS binary set in $repo/zig-out/bin." >&2
      return 1
    fi
    mkdir -p "$HOME/.local/bin"
    install -m 0755 "$repo/zig-out/bin/cas" "$HOME/.local/bin/cas"
    install -m 0755 "$repo/zig-out/bin/cas_review_session" "$HOME/.local/bin/cas_review_session"
    install -m 0755 "$repo/zig-out/bin/cas_smoke_check" "$HOME/.local/bin/cas_smoke_check"
    install -m 0755 "$repo/zig-out/bin/cas_instance_runner" "$HOME/.local/bin/cas_instance_runner"
    install -m 0755 "$repo/zig-out/bin/cas_conformance_suite" "$HOME/.local/bin/cas_conformance_suite"
  }

  # Declare and assign separately so a uname failure is not masked by `local`.
  local os
  os="$(uname -s)"

  # Fast path: a compatible Zig cas is already on PATH.
  if command -v cas >/dev/null 2>&1 && cas --help 2>&1 | grep -q "cas.zig"; then
    if cas "$cas_subcommand" --help 2>&1 | grep -q "$marker"; then
      # The ${arr[@]+...} form keeps an empty array safe under `set -u` on bash 3.2.
      cas "$cas_subcommand" ${pre_args[@]+"${pre_args[@]}"} "$@"
      return
    fi
    echo "cas binary found, but marker check failed for subcommand: $cas_subcommand" >&2
    return 1
  fi

  # Install path: Homebrew on macOS, direct Zig build elsewhere.
  if [ "$os" = "Darwin" ]; then
    if ! command -v brew >/dev/null 2>&1; then
      echo "homebrew is required on macOS: https://brew.sh/" >&2
      return 1
    fi
    if ! brew install tkersey/tap/cas; then
      echo "brew install tkersey/tap/cas failed." >&2
      return 1
    fi
  elif ! (command -v cas >/dev/null 2>&1 && cas --help 2>&1 | grep -q "cas.zig"); then
    if ! install_cas_direct; then
      return 1
    fi
  fi

  # Re-check after the install attempt, then dispatch.
  if command -v cas >/dev/null 2>&1 && cas --help 2>&1 | grep -q "cas.zig"; then
    if cas "$cas_subcommand" --help 2>&1 | grep -q "$marker"; then
      cas "$cas_subcommand" ${pre_args[@]+"${pre_args[@]}"} "$@"
      return
    fi
    echo "cas binary found, but marker check failed for subcommand: $cas_subcommand" >&2
    return 1
  fi

  echo "cas binary missing or incompatible after install attempt." >&2
  if [ "$os" = "Darwin" ]; then
    echo "expected install path: brew install tkersey/tap/cas" >&2
  else
    echo "expected direct path: SKILLS_ZIG_REPO=<skills-zig-path> zig build -Doptimize=ReleaseSafe" >&2
  fi
  return 1
}
run_cas_tool smoke-check --cwd /path/to/workspace --json
run_cas_tool review-session start --cwd /path/to/workspace --uncommitted --json
Terminology (Instances)
- An "instance" is one cas_proxy_client-managed codex app-server child process.
- Each instance executes one request path with isolated client metadata and optional state-file isolation.
- "N instances" means N parallel client+app-server pairs in cas instance_runner.
Trigger Cues
- "instances" / "multi-instance" / "parallel sessions"
- "review session" / "detached review" / "reviewThreadId" / "interrupt review"
- "swarm conformance" / "claim-safe wave" / "stale-claim reclaim"
- app-server method checks (thread/start, thread/resume, thread/fork, thread/read, thread/list, thread/archive, thread/unarchive, thread/rollback, turn/start, turn/steer, turn/interrupt, review/start)
- command/file approval behavior, especially availableDecisions
- session mining through direct app-server method execution
- protocol sanity checks before orchestration
Workflow
- Validate basic app-server wiring first.
run_cas_tool smoke-check --cwd /path/to/workspace --json
- Treat this as a protocol preflight before any fanout run.
- Use review_session when the real job is detached review lifecycle control rather than one-shot probing.
- Default decision rule:
- Need detached control on Codex 0.118.x stdio: use cas review_session start --wait.
- Need detached control on a runtime that keeps detached review alive across connections: use cas review_session with a split start followed by wait.
- Need only a one-shot git-backed verdict: use native codex review unless the caller explicitly needs detached control. Multi-cycle remediation loops should not be CAS-first by default.
- Start detached review:
cas review_session start --cwd /path/to/workspace --uncommitted --json
cas review_session start --cwd /path/to/workspace --base main --json
cas review_session start --cwd /path/to/workspace --parent-thread-id <threadId> --parent-mode reuse --base main --json
cas review_session start --cwd /path/to/workspace --commit <sha> --title "<subject>" --json
cas review_session start --cwd /path/to/workspace --custom-instructions @review.txt --json
cas review_session start --wait --cwd /path/to/workspace --base main --fallback native-review --json
- Read current status from a fresh process when the runtime supports detached polling:
cas review_session status --review-thread-id <reviewThreadId> --json
- Wait for the detached review turn to settle when the runtime supports detached polling:
cas review_session wait --review-thread-id <reviewThreadId> --timeout-ms 300000 --json
- Supported same-process lane on Codex 0.118.x stdio:
cas review_session start --wait --cwd /path/to/workspace --base main --json
- Interrupt the detached review turn:
cas review_session interrupt --review-thread-id <reviewThreadId> --json
reviewThreadId is the handle; do not invent a second review session id.
- Review hygiene:
- On Codex 0.118.x stdio, prefer start --wait ... --json when the detached verdict matters; that is the supported lane because review items stream on the live connection.
- On runtimes that keep detached review alive across connections, prefer start ... --json followed by wait ... --json when the verdict matters; it leaves a recoverable handle if wait times out or the process dies.
- In first-party caller workflows, treat one detached CAS attempt as one start plus any wait retries on the returned reviewThreadId, keyed by the frozen review target plus resolved Codex path/version. If that attempt returns incompatible_codex_review_runtime, stop relaunching detached CAS for the same key in that run and let the caller decide whether to switch to native codex review.
- Reuse a parent thread only with --parent-mode reuse plus a known materialized parent; otherwise let CAS choose auto or force fresh.
- Treat reviewResultAvailable, compatibilityVerdict, fallbackUsed, and failureCode as the verdict surface. Do not infer success from process exit alone.
- Detached review is the public review-control path; do not route review-session control through instance_runner.
instance_runner remains a method probe lane and is still useful for schema sanity checks.
review_session owns persisted review handles, fresh-process status polling, wait loops, and interruption.
- For workflows that need the actual review verdict, use the live connection lane the runtime supports: start --wait on Codex 0.118.x stdio, or split start ... --json then wait ... --json on runtimes that keep detached review alive across connections.
- Treat failureCode as authoritative. CAS never silently falls back to native codex review; callers that want a temporary fallback must do it explicitly at their own layer.
- For swarm-hardening runs, treat $st as the durable source of truth before any worker starts.
st import-orchplan --file .step/st-plan.jsonl --input .step/orchplan.yaml
st claim --file .step/st-plan.jsonl --wave w1 --executor codex
- CAS probes the wave; it does not replace the durable claim ledger.
- Enforce handshake assumptions when diagnosing failures.
- Confirm the session completed initialize then initialized before method calls.
- If you see "Not initialized" or "Already initialized", treat it as a connection-lifecycle error, not a method payload error.
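A triage script could apply the rule above with a simple pattern match on the error text. This is a minimal sketch: classify_app_server_error and its two output labels are hypothetical, and it assumes the error message has already been pulled out of the response as a plain string.

```shell
# Classify an app-server error message for triage.
# NOTE: hypothetical helper; only the two quoted messages come from the text above.
classify_app_server_error() {
  case "${1:-}" in
    *"Not initialized"*|*"Already initialized"*)
      # The initialize / initialized handshake was skipped or repeated.
      echo "connection_lifecycle" ;;
    *)
      echo "method_or_other" ;;
  esac
}

classify_app_server_error "Not initialized"   # prints: connection_lifecycle
```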
- Run one direct method request (single-request lane).
run_cas_tool request --cwd /path/to/workspace --method thread/start --params-json '{"cwd":"/path/to/workspace","experimentalRawEvents":false}' --json
- Run fanout/multi-instance requests.
run_cas_tool instance-runner --cwd /path/to/workspace --instances 12 --method thread/list --params-json '{"cursor":null,"limit":1}' --json
- Run the conformance suite when you need repeatable swarm checks around claims or retry policy.
cas conformance --cwd /path/to/workspace --json
- Narrow to one scenario when debugging with --scenario <name>.
- Use --skip-smoke-check only when you intentionally want local $st scenarios without the live CAS preflight.
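When debugging, it can help to run the three documented scenarios one at a time so a failure is attributed to a single scenario. The sketch below is a hypothetical wrapper: run_each_scenario is not part of the cas CLI, and it assumes cas conformance exits non-zero when a scenario fails. The runner is passed as the first argument so the loop can be dry-run with echo.

```shell
# Run each documented conformance scenario separately and report which ones fail.
# Args: runner cwd   (use "cas" as the runner for a real run)
# NOTE: hypothetical wrapper; assumes a non-zero exit status signals scenario failure.
run_each_scenario() {
  local runner="$1" cwd="$2" scenario failed=0
  for scenario in claim_safe_wave stale_claim_reclaim overload_backoff; do
    if ! "$runner" conformance --cwd "$cwd" --scenario "$scenario" --json; then
      echo "scenario failed: $scenario" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Dry run: with echo as the runner, the argument lists are printed instead of executed.
run_each_scenario echo /path/to/workspace
```

For a real run, the call would be run_each_scenario cas /path/to/workspace.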
- Apply overload handling on request saturation.
- If app-server returns JSON-RPC error code -32001 ("Server overloaded; retry later."), retry with exponential backoff and jitter.
- Do not treat -32001 as a permanent protocol mismatch.
- In cas conformance, the retry policy scenario is currently synthetic and should be treated as retry-policy proof, not live saturation proof.
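One way a caller could compute the wait between retries after a -32001 response is capped exponential backoff with full jitter. The sketch below is an illustration only, not the retry policy that cas conformance checks: backoff_delay_ms, the 250 ms base, and the 8000 ms cap are assumptions made for this example.

```shell
# Jittered delay in milliseconds for a 1-based retry attempt.
# Upper bound: min(cap_ms, base_ms * 2^(attempt-1)); result: random value in [0, bound].
# NOTE: hypothetical helper; base and cap defaults are illustrative, not CAS values.
# $RANDOM tops out at 32767, so caps above that would be under-jittered.
backoff_delay_ms() {
  local attempt="$1" base_ms="${2:-250}" cap_ms="${3:-8000}"
  local delay="$base_ms" i=1
  # Double the bound once per prior attempt, stopping early once the cap is reached.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap_ms" ]; do
    delay=$(( delay * 2 ))
    i=$(( i + 1 ))
  done
  if [ "$delay" -gt "$cap_ms" ]; then
    delay="$cap_ms"
  fi
  echo $(( RANDOM % (delay + 1) ))
}

for attempt in 1 2 3 4; do
  echo "attempt $attempt: wait $(backoff_delay_ms "$attempt") ms before retrying"
done
```

Full jitter spreads retries from many instances across the whole window, which matters for fanout runs where several clients may see the overload at the same moment.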
- Drive specific thread/turn methods as needed.
- Start thread:
run_cas_tool request --cwd /path/to/workspace --method thread/start --params-json '{"cwd":"/path/to/workspace","experimentalRawEvents":false}' --json
- Start turn:
run_cas_tool request --cwd /path/to/workspace --method turn/start --params-json '{"threadId":"thr_123","input":[{"type":"text","text":"summarize the repo status"}]}' --json
- Thread read:
run_cas_tool request --cwd /path/to/workspace --method thread/read --params-json '{"threadId":"thr_123","includeTurns":true}' --json
- Resume thread:
run_cas_tool request --cwd /path/to/workspace --method thread/resume --params-json '{"threadId":"thr_123"}' --json
- Steer turn:
run_cas_tool request --cwd /path/to/workspace --method turn/steer --params-json '{"threadId":"thr_123","expectedTurnId":"turn_abc","input":[{"type":"text","text":"continue"}]}' --json
- Interrupt turn:
run_cas_tool request --cwd /path/to/workspace --method turn/interrupt --params-json '{"threadId":"thr_123","turnId":"turn_abc"}' --json
- Use method-specific params for list/mine flows.
thread/list supports filter params (cursor, limit, searchTerm, cwd, etc.) as provided by your app-server version.
turn/steer requires expectedTurnId.
- Gate experimental methods and payload fields explicitly.
- Experimental surfaces such as thread/backgroundTerminals/clean, thread/realtime/*, and thread/start dynamic-tool fields require initialize.params.capabilities.experimentalApi = true.
- If omitted, treat failures as capability negotiation errors.
- Respect native CAS server-request limits.
- The current Zig client auto-answers item/commandExecution/requestApproval, item/fileChange/requestApproval, item/permissions/requestApproval, item/tool/requestUserInput, mcpServer/elicitation/request, and item/tool/call.
- Default native behavior is conservative: permissions requests are denied, request-user-input questions use the first option label when present, MCP elicitations are declined, and dynamic tool calls return success: false unless you override with explicit CLI flags.
Approval and Request Semantics
- Exec/file approval decisions are handled by the Zig client (--exec-approval, --file-approval, --read-only).
- Permission approvals can be controlled with --permissions-approval deny|grant-turn|grant-session.
item/tool/requestUserInput, mcpServer/elicitation/request, and item/tool/call can be overridden with --request-user-input-response-json, --elicitation-action plus --elicitation-content-json, and --dynamic-tool-response-json.
cas review_session now accepts the same approval/runtime overrides as cas instance_runner; use them when detached review must be permissioned or fully deterministic under approval prompts.
- For command approvals, CAS resolves decisions against server-provided availableDecisions when present.
- Unknown server-request methods are rejected fail-closed in native mode to prevent deadlocks.
- For overload responses (-32001), CAS callers should retry with exponential backoff and jitter.
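To make the availableDecisions point concrete, here is one plausible way to resolve a preferred decision against a server-provided list. It is a sketch under assumptions, not the Zig client's actual logic: resolve_decision is hypothetical, the decision names in the example are illustrative placeholders, and refusing when the preferred decision is not offered is an assumed fail-closed choice.

```shell
# Resolve a preferred approval decision against the decisions the server offered.
# Args: preferred [offered...]
# With no offered list, the preferred decision is kept as-is.
# Prints nothing and returns 1 when the preferred decision is not offered.
# NOTE: hypothetical helper; the fail-closed behavior is an assumption of this sketch.
resolve_decision() {
  local preferred="$1" offered
  shift
  if [ "$#" -eq 0 ]; then
    printf '%s\n' "$preferred"
    return 0
  fi
  for offered in "$@"; do
    if [ "$offered" = "$preferred" ]; then
      printf '%s\n' "$preferred"
      return 0
    fi
  done
  return 1
}

resolve_decision accept accept decline   # prints: accept
```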
Scope Boundaries (Zig-Only Cutover)
- This skill no longer exposes a Node JSONL proxy lifecycle.
- Legacy message envelopes (cas/request, cas/respond, cas/send, cas/state/get, cas/stats/get) are removed from this skill contract.
- Dynamic tool reply loops are supported only through static response payloads passed on the CAS CLI; native CAS is not a full interactive tool-runtime host.
cas review_session persists raw request/response artifacts and detached review handles; it is not a generalized streaming event mirror for all app-server notifications.
Canonical Schema Source
Use your installed codex binary to generate schemas that match your version:
codex app-server generate-ts --out DIR
codex app-server generate-json-schema --out DIR
codex app-server generate-ts --experimental --out DIR
codex app-server generate-json-schema --experimental --out DIR
Local References
Read references/codex_app_server_contract.md for API/method notes that inform CAS request usage.
Resources
cas binary dispatcher:
cas conformance
cas review_session
cas smoke_check
cas instance_runner
cas_conformance_suite binary: swarm conformance around $st claims and retry policy.
cas_review_session binary: detached review start/status/wait/interrupt with persisted reviewThreadId handles.
cas_smoke_check binary: protocol/API smoke validation.
cas_instance_runner binary: single or multi-instance method execution.
Runtime bootstrap policy mirrors seq: require installed cas Zig binaries, default to brew install tkersey/tap/cas on macOS, and fall back to a direct Zig install from skills-zig on non-macOS.