agy-headless-evidence
Run AGY headlessly via scheduled ticks or `agy -p`, capture agentapi JSONL evidence, and validate automated AGY loops or event streams.
| name | agy-headless-evidence |
| user-invocable | false |
| description | Run AGY headlessly via scheduled ticks or `agy -p`, capture agentapi JSONL evidence, and validate automated AGY loops or event streams. |
| practices | ["design-by-contract","evidence-over-assertion"] |
| hexagonal_role | supporting |
| consumes | ["agy-native"] |
| produces | ["agy-evidence-dir"] |
| context_rel | [{"kind":"customer-of","with":"agy-native"},{"kind":"supplier-to","with":"validate"}] |
| skill_api_version | 1 |
| context | {"window":"inherit","intent":{"mode":"task"},"sections":{"exclude":["HISTORY"]},"intel_scope":"topic"} |
| metadata | {"tier":"execution","dependencies":["dcg","agy-native"],"stability":"experimental"} |
| output_contract | A timestamped run directory containing events.jsonl (the headless event stream), last-message.{txt,json} (the final agent message), exit-code (captured $?), and command.txt (argv/cwd/model/scopes) — the validator's proof surface; mirrored as a userFacing brain artifact. |
Run Antigravity (AGY) headlessly — a one-shot agy -p or a scheduled tick driven through the
agentapi sidecar — and leave a JSONL proof surface a validator can read back after the
session ends: the event stream, the final message, the exit code, and the exact command. Scope in,
evidence out.
The factory dispatches AGY workers non-interactively. A worker that only prints prose to a terminal leaves nothing a validator can verify later (per the cross-agent rule: you read a worker's published compression, never its live session). This skill makes every headless AGY run produce a durable, inspectable artifact.
AGY's headless surface (distinct from gemini-cli: there is no `gemini -p`, no `gemini mcp`):

- `agy -p "<prompt>"` / `agy --print` runs the prompt, prints, and exits (`--print-timeout` default 5m). `-c`/`--continue` resumes the latest conversation; `--conversation <id>` resumes by id.
- The agentapi sidecar keeps a persistent server instead of launching `agy` each time, which is useful for recurring loops where you want a persistent brain and warm conversation state.
- `--add-dir <dir>` (repeatable) bounds which repos a run can touch; pair with a scoped git worktree so concurrent roles never share a working tree.
- `~/.gemini/antigravity-cli/{brain,knowledge}/` is the canonical place to mirror a run's verdict so a different context can consume it.

Use it whenever a headless AGY run is part of an automated loop and someone (or something) downstream must trust the result.
- Capture the exit code with `echo "$?" > exit-code` on the line right after `agy -p` returns. Why: a plausible final answer with a non-zero exit is still a failed run; validators must key off process reality, not self-report (evidence over comfort).
- A validator never gets `--dangerously-skip-permissions`; a scoped author gets `--add-dir` to exactly its worktree. Why: the permission + scope flags are the runtime boundary, not a convenience.
- The `dcg` guard stays on. `~/.gemini/settings.json` wires a BeforeTool hook on `run_shell_command` to `dcg`; keep it even under `--dangerously-skip-permissions`. Why: the auto-approve flag would otherwise let a destructive command through; `dcg` is the floor.
- Record argv, cwd, model, scopes (`--add-dir`), and whether it ran one-shot or via the sidecar in `command.txt`. Why: a run that cannot be reproduced is weak evidence.
- Never use `claude -p` / `claude --print` to "do the same for Claude." Why: `claude -p` bills the API per-token, not the Max sub, and is banned for worker dispatch; Claude workers go through NTM panes / spawned subagents. The AGY runtime is `agy` / `agy -p`.

Decide whether the run is an author, validator, researcher, or tie-breaker, and pick the scope + permission posture from that role before launching.
| Role | Scope | Permission posture |
|---|---|---|
| Author (edits) | --add-dir to one worktree | --dangerously-skip-permissions (dcg still on) |
| Validator (read-mostly) | --add-dir to the repo, no writes | no skip-permissions; edits forbidden |
| Researcher | --add-dir read context | no skip-permissions |
| Externally-sandboxed batch worker | scoped worktree | skip-permissions only by explicit policy |
Checkpoint: confirm the role, its --add-dir scope, and that you are NOT granting an author's
posture to a validator before launching.
RUN_DIR="$(pwd)/.agy-evidence/$(date -u +%Y%m%dT%H%M%SZ)-${ROLE:-run}"
mkdir -p "$RUN_DIR"
{
printf 'cwd=%s\n' "$PWD"
printf 'mode=%s\n' "${AGY_MODE:-oneshot}" # oneshot | sidecar
printf 'scopes=%s\n' "$REPO"
printf 'cmd=%s\n' 'agy -p <prompt> --add-dir <repo> --print-timeout 600'
} > "$RUN_DIR/command.txt"
Add model.txt if a non-default --model is used; add scope.txt for edit runs.
Checkpoint: command.txt exists and records cwd, mode (oneshot vs sidecar), and scope.
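The snippet above records a placeholder command string. If the exact invocation should be reproducible, one option is to write the real argv instead. The following is a sketch under that assumption; `write_command_txt` and the `argv=` key are hypothetical names, not an AGY or agentapi convention.

```sh
# Hypothetical helper: record the exact invocation in command.txt.
# Usage: write_command_txt <run-dir> <mode> <scope> <command> [args...]
write_command_txt() {
  run_dir=$1
  mode=$2
  scope=$3
  shift 3
  mkdir -p "$run_dir" || return 1
  {
    printf 'cwd=%s\n' "$PWD"
    printf 'mode=%s\n' "$mode"       # oneshot | sidecar
    printf 'scopes=%s\n' "$scope"
    # One argv element per line, so arguments with spaces stay unambiguous.
    for arg in "$@"; do
      printf 'argv=%s\n' "$arg"
    done
  } > "$run_dir/command.txt"
}
```

For example, `write_command_txt "$RUN_DIR" oneshot "$REPO" agy -p "$PROMPT" --add-dir "$REPO"` would record each argument on its own `argv=` line. Arguments that themselves contain newlines are not handled by this sketch.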
One-shot author run (scoped worktree, dcg on):
agy -p "Claim one ready bead via br. Implement only it in this worktree. \
Commit scoped. Write evidence to brain as userFacing. Do NOT close it — a judge will." \
--add-dir "$REPO" --dangerously-skip-permissions --print-timeout 600 \
> "$RUN_DIR/events.jsonl" 2> "$RUN_DIR/stderr.log"
echo "$?" > "$RUN_DIR/exit-code"
Sidecar / scheduled-tick run (persistent server; resume warm state by id):
agy -p "Validate bead <id> against its evidence artifact ONLY. You did not author it. \
Emit VERDICT: PASS|WARN|FAIL to brain as a userFacing verdict. Do not edit code." \
--conversation "$CONV_ID" --add-dir "$REPO" --print-timeout 600 \
> "$RUN_DIR/events.jsonl" 2> "$RUN_DIR/stderr.log"
echo "$?" > "$RUN_DIR/exit-code"
Do not broaden --add-dir without recording why in scope.txt.
Checkpoint: exit-code is written and events.jsonl is non-empty before declaring the run done.
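Both runs above share the same capture pattern: stdout to `events.jsonl`, stderr to `stderr.log`, then `$?` written immediately. A wrapper can make that pattern hard to get wrong. This is a minimal sketch; `capture_run` is a hypothetical helper, and it wraps whatever command it is given rather than assuming anything about `agy` itself.

```sh
# Hypothetical helper: run a command and capture the proof surface.
# Usage: capture_run <run-dir> <command> [args...]
capture_run() {
  run_dir=$1
  shift
  mkdir -p "$run_dir" || return 1
  # stdout is the event stream; stderr is kept in its own file.
  "$@" > "$run_dir/events.jsonl" 2> "$run_dir/stderr.log"
  rc=$?
  # Write the exit code before any other command can overwrite $?.
  echo "$rc" > "$run_dir/exit-code"
  return "$rc"
}
```

A call such as `capture_run "$RUN_DIR" agy -p "$PROMPT" --add-dir "$REPO" --print-timeout 600` would then leave the same files as the commands above, and the wrapper's own return status mirrors the run's exit code.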
Persist a userFacing artifact so a different context can consume it (per agy-native author!=judge):
~/.gemini/antigravity-cli/brain/<conversation-id>/<name>_verification.md
(+ .metadata.json, userFacing:true).

Then check the evidence holds:
test -s "$RUN_DIR/exit-code"
test "$(cat "$RUN_DIR/exit-code")" = 0
test -s "$RUN_DIR/events.jsonl"
test -s "$RUN_DIR/command.txt"
If any check fails, the downstream verdict is FAIL or NEEDS-EVIDENCE.
Checkpoint: the run-dir path (and brain artifact) is referenced in the bead / Agent Mail compression so the evidence is discoverable downstream.
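The four `test` checks above can be folded into one pre-check that a validator runs before reading anything else. The sketch below is illustrative only: `precheck_run` is a hypothetical name, and the mapping (missing or empty file gives NEEDS-EVIDENCE, non-zero exit gives FAIL) is an assumption, since the text only says the verdict is one of the two. It does not replace the PASS/WARN/FAIL verdict produced downstream.

```sh
# Hypothetical validator-side pre-check over one run directory.
# Prints EVIDENCE-OK, FAIL, or NEEDS-EVIDENCE (mapping is an assumption).
precheck_run() {
  run_dir=$1
  # Missing or empty proof files: there is nothing to judge yet.
  for f in exit-code events.jsonl command.txt; do
    if [ ! -s "$run_dir/$f" ]; then
      echo "NEEDS-EVIDENCE"
      return 1
    fi
  done
  # Evidence is present; the recorded process exit code decides.
  if [ "$(cat "$run_dir/exit-code")" = 0 ]; then
    echo "EVIDENCE-OK"
  else
    echo "FAIL"
    return 1
  fi
}
```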
Format: a per-run directory of plain files (JSONL + text + exit code), mirrored to a brain artifact.
Filename / path: <workdir>/.agy-evidence/<UTC-timestamp>-<role>/
Structure:
- `events.jsonl`: the captured headless event stream (REQUIRED proof surface)
- `last-message.txt` or `last-message.json`: the final agent message (REQUIRED)
- `exit-code`: captured `$?` (REQUIRED)
- `command.txt`: argv, cwd, mode (oneshot|sidecar), model, `--add-dir` scopes (REQUIRED)
- `stderr.log`: captured stderr (recommended)
- `changed-files.txt`, `scope.txt`, `model.txt`, `verdict.md`
- `~/.gemini/antigravity-cli/brain/<conversation-id>/<name>_verification.md` (userFacing:true)

- `exit-code` captured immediately after the run and used in the verdict (Rule 1)
- `events.jsonl` (Rule 2)
- `dcg` BeforeTool hook present in `~/.gemini/settings.json` (Rule 4)
- `events.jsonl` captured and treated as source of truth over the pretty stream (Rule 5)
- `command.txt` records argv/cwd/mode/model/scopes, so the run is reproducible (Rule 6)
- No `claude -p` / `claude --print` anywhere; runtime is `agy -p` (Rule 7 / LAW 0)

- `agy -p "Validate bead AG-123 read-only. VERDICT: PASS|FAIL." --conversation "$CONV_ID" --add-dir "$REPO" > run/events.jsonl 2> run/stderr.log; echo $? > run/exit-code`
- `agy -p "Implement AG-123 in this worktree; commit scoped; evidence to brain; do not close." --add-dir "$WT" --dangerously-skip-permissions > run/events.jsonl; echo $? > run/exit-code`
- Author `agy -p --model "Gemini 3.1 Pro (High)"`, judge `agy -p --model "Claude Opus 4.6 (Thinking)"`: two contexts, one loop, no shared session.

| Problem | Cause | Solution |
|---|---|---|
| Empty events.jsonl but pretty output appeared | stdout not redirected to the file | redirect agy -p stdout to events.jsonl |
| Headless run exits empty | --print-timeout hit or no model reachable | raise --print-timeout; confirm agy models lists a model; check OAuth in ~/.gemini/settings.json |
| Validator made edits | author posture given to a validator | rerun without --dangerously-skip-permissions; enforce read-mostly scope |
| Run "succeeded" but downstream is wrong | exit code ignored | always echo $? > exit-code; key the verdict off it |
| Worker tried a destructive command | auto-approve under --dangerously-skip-permissions | the dcg BeforeTool hook should block it — confirm it's wired in ~/.gemini/settings.json |
| Judge agreed with author too easily | warm context reused (-c/--continue) | start a fresh conversation (no --continue); read-mostly scope |
| Cannot resume the tick's state | no conversation id captured | record --conversation <id> in command.txt; reuse it for the sidecar tick |
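Several rows above end with "confirm it's wired in ~/.gemini/settings.json". A quick pre-flight check can catch a missing hook before an author run starts. This is only a heuristic sketch: `dcg_hook_wired` is a hypothetical helper, and it assumes the hook entry mentions both `BeforeTool` and `dcg` literally, because the settings schema is not given here. It does not prove the hook is attached to `run_shell_command`.

```sh
# Heuristic pre-flight check (assumption: the hook entry contains both
# the literal strings "BeforeTool" and "dcg").
# Usage: dcg_hook_wired [settings-file]
dcg_hook_wired() {
  settings=${1:-$HOME/.gemini/settings.json}
  # An unreadable or missing settings file counts as "not wired".
  [ -r "$settings" ] || return 1
  grep -q 'BeforeTool' "$settings" && grep -q 'dcg' "$settings"
}
```

A launcher could refuse to pass `--dangerously-skip-permissions` unless `dcg_hook_wired` succeeds.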
- … consumes.
- `dcg`: destructive-command guard; the BeforeTool floor this skill keeps on.
- `agentops:validate`: produces the PASS/WARN/FAIL verdict over this proof surface.
- `~/dev/control-plane/migrations/gemini-to-agy.md` (AGY ≠ gemini-cli; LAW 0).