| name | peer |
| description | The /peer command. Bidirectional peer-call workflow between Claude Code and OpenAI Codex CLI — call the other AI for second-opinion review or sandboxed worker implementation, with verdict-driven iteration up to 3 rounds (uncapped under nuclear mode). On first invocation in a project, bootstraps the per-project folder layout idempotently. After bootstrap, /peer <args> parses per the grammar in this file (light/medium/heavy/nuclear modes; review/worker/preview actions; status/resume/cancel/doctor/help/config/ask/uninstall subcommands). User-invoked only. |
| disable-model-invocation | true |
| argument-hint | <task> | /peer-help for full list |
# Peer-Call Workflow Skill
Activates the bidirectional peer-call workflow between Claude Code and OpenAI Codex CLI for the rest of this conversation. Either AI can call the other for review (read-only) or worker (sandboxed git worktree) work; both peers iterate to AGREE under a verdict-driven protocol.
This skill is the single source of truth for the workflow rules. Editing SKILL.md propagates to every project that invokes the skill on next session. Per-project state is just .agent-work/ and the optional peer-deny-bash.txt.
/peer is the single entry point. Every /peer invocation triggers two things in order: (1) run the bootstrap section below (idempotent — no-op if already done in this project), then (2) parse and execute the command per the /peer user-facing command section. Bare /peer (no args) prints a one-line status confirming the skill is active and points the user at /peer help.
## OPERATING PROTOCOL (must survive compaction)
The rules below are the live workflow. When this skill is active, they govern every peer call.
Shared protocol doc. The cross-AI rules (triggers, skip rules, iteration protocol, severity buckets, REVISE markers, status-file schema, file naming, safety rules, fanout roles, prompt-language) live in ~/.claude/skills/peer/protocol.md. That doc is shared with the Codex /peer skill so edits propagate to both sides. Sections below that have been moved to protocol.md are replaced with a pointer; sections that are Claude-specific (plan-mode interaction, hooks, command parser, install steps, mac path-rendering quirks) stay in this file.
### Plan-mode interaction
When the user invokes /peer or any /peer-* wrapper while Claude Code's plan mode is active, this skill takes precedence over plan mode's generic 5-phase Explore/Plan/Review workflow. The peer skill manages its own plan-iterate / impl-iterate flow, calls its peer (Codex/Claude) for review, and applies the verdict-driven iteration protocol below — all of which substitutes for plan mode's generic workflow on this turn.
The bootstrap installs a UserPromptSubmit hook at ~/.claude/hooks/peer-route.py (registered in ~/.claude/settings.json) that injects an additionalContext override on every /peer* prompt to enforce this routing precedence at the harness level. The hook is the primary defense; this prose note is the fallback when the hook is not yet installed (e.g., very first invocation on a fresh machine before bootstrap step U5 has run).
### Protocol-adherence guardrail (Piece 2)
When the user typed /peer, /peer-light, /peer-medium, /peer-heavy, /peer-nuclear, or /peer <mode> <task>, the plan you propose MUST include peer iteration via the appropriate ask-{claude,codex}-{review,worker}.py script. Three failure modes are protocol violations, not tradeoffs:
- Skipping iteration entirely. Proposing a plan that omits peer iteration when the user invoked /peer*. Example: "Single-pass audit, no peer iteration with Codex" as a default. The user typed the slash command precisely to get iteration.
- User-approval-as-peer-substitute. Treating the user's "yes proceed" on your plan as a substitute for peer-call iteration. The user is not the peer. User approval locks scope; the peer validates substance.
- Mid-round short-circuit after REVISE. Receiving a REVISE verdict from the peer, integrating items on your own judgment, and declaring "effective AGREE by acknowledgement" without sending the round-N+1 prompt back to the peer. Round trips are mandatory: REVISE → mark each item APPLIED/REJECTED/ACKNOWLEDGED → send → peer verdicts again. The 3-round cap (or uncapped under nuclear) is a ceiling, not a target. The only way to terminate iteration before peer AGREE is via the explicit "Early stop" rules below (remaining items OPTIONAL only, peer ESCALATE, or 3 rounds elapsed).
Piece 6's script-level enforcement (invoke_piece6_check in ~/.claude/skills/peer/scripts/lib/peer_lib/piece6_check.py) backstops failure mode #3 structurally: the peer-call scripts refuse to fire if a prior round was REVISE without proper APPLIED/REJECTED/ACKNOWLEDGED markers in the new prompt. Failure modes #1 and #2 are caught only by this rule plus the user noticing.
If you genuinely believe a round-trip is low-value (e.g., the user explicitly told you to abandon iteration mid-task), surface that as an explicit ASK-FOR-OVERRIDE question to the user before acting on it. Then if the user confirms, pass -AcknowledgeAbort "<60+ char concrete reason citing files/lines/phase>" -AcknowledgeAbortConfirm "<task-slug>" to the script to record the override in the status file. Never pocket the round on your own.
### Tools and script paths
Reviewer scripts (read-only):
~/.codex/tools/ask-claude-review.py (Codex calls Claude)
~/.codex/tools/ask-codex-review.py (Claude calls Codex)
Worker scripts (isolated git worktree, can edit code):
~/.codex/tools/ask-claude-worker.py (Codex calls Claude)
~/.codex/tools/ask-codex-worker.py (Claude calls Codex)
The worker scripts' built-in sandboxes substitute for case-by-case authorization. Either agent may invoke either worker without per-task explicit approval, provided the triggers below are met. Do not give the peer direct write access to the main worktree — always use the worker script.
Worker worktrees live at <home>/.peer-worktrees/<peer>/<project-id>/<branch-name>/. To remove a worktree: git worktree remove --force <home>/.peer-worktrees/<peer>/<project-id>/<branch-name>. (Always --force: workers do NOT auto-commit, so the worktree is dirty at DONE; plain worktree remove errors with "contains modified or untracked files".) Branches <peer>/<branch-name> and <peer>-base/<branch-name> can be deleted with git branch -D.
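The cleanup sequence above can be sketched as a small helper that builds the three git commands. This is an illustrative sketch, not part of the installed tooling; the helper name is hypothetical, and the paths follow the layout described above.

```python
from typing import List

def worktree_cleanup_commands(home: str, peer: str, project_id: str, branch: str) -> List[List[str]]:
    """Build the git commands that retire a worker worktree (hypothetical helper).

    --force is mandatory: workers do NOT auto-commit, so the worktree is
    dirty at DONE and a plain `git worktree remove` would error.
    """
    wt = f"{home}/.peer-worktrees/{peer}/{project_id}/{branch}"
    return [
        ["git", "worktree", "remove", "--force", wt],
        ["git", "branch", "-D", f"{peer}/{branch}"],
        ["git", "branch", "-D", f"{peer}-base/{branch}"],
    ]
```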
Invocation pattern:
py -3 "<script>" --input-file "<prompt>" [--worktree-name "<name>"]
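A sketch of assembling that argv programmatically, using only the flags this document names (`--input-file`, `--worktree-name`, `--effort`); the function name is hypothetical:

```python
def build_invocation(script, prompt_file, worktree_name=None, effort=None):
    """Assemble the `py -3` argv for a peer-call script (illustrative sketch)."""
    argv = ["py", "-3", script, "--input-file", prompt_file]
    if worktree_name:
        argv += ["--worktree-name", worktree_name]
    if effort:  # see "Effort defaults" / the per-mode mapping for valid values
        argv += ["--effort", effort]
    return argv
```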
### MUST call the peer when
See protocol.md §MUST call the peer when.
### Skip rules (explicit exceptions to the MUST-call triggers)
See protocol.md §Skip rules.
### Loop prevention
See protocol.md §Loop prevention.
### Effort defaults
Scope: This section governs DIRECT script invocations only — i.e., when the user or a non-/peer code path runs ask-codex-{review,worker}.py or ask-claude-{review,worker}.py directly. For /peer <mode> dispatches, the modes table in section 2 ("Modes (effort + iteration depth)") is authoritative and the difficulty-bump rule below DOES NOT apply. See "Dispatcher-to-script effort mapping" below the modes table for the locked per-mode lookup.
- Codex (gpt-5.5): default low. Bump to medium only if the task is difficult.
- Claude (claude-opus-4-7): default xhigh. Bump to max only if the task is difficult.
A task counts as "difficult" if any of these apply: many moving parts, security-sensitive, novel/unfamiliar problem, you've already failed once at the default effort, or the change has lasting/expensive consequences. Don't bump effort on simple or routine tasks — it wastes tokens.
Why these defaults: Parker explicitly chose gpt-5.5 low / claude-opus-4-7 xhigh as the baseline. Pre-existing project defaults from other templates are NOT preserved — this skill's defaults govern.
Override at invocation:
py -3 "<script>" --input-file "<prompt>" --effort medium
py -3 "<script>" --input-file "<prompt>" --effort max
### Clarifying questions — ask the user when you have one (don't assume)
See protocol.md §Clarifying questions.
### Iteration-to-agreement protocol
See protocol.md §Iteration-to-agreement protocol. Covers prompt structure, round structure, verdicts (AGREE/REVISE/ESCALATE), severity buckets, APPLIED/REJECTED/ACKNOWLEDGED markers, light-mode caller rule, early stop, round 2+ narrowing, nuclear convergence, judgment-validation loop, and verification.
Claude-specific addenda:
- Verdict extraction. Scan the response for the ## VERDICT section. The verdict word is the first whitespace-delimited token after that heading; match case-insensitively against AGREE / REVISE / ESCALATE. If no ## VERDICT section exists, treat the response as ESCALATE. The scripts do NOT parse verdicts — Claude extracts them before deciding whether to start round N+1.
- Light-mode violation example (screenshot bug): Codex returns REVISE 0-BLOCKING / 2-SHOULD-FIX / 2-OPTIONAL. Caller auto-applies all four items because "each one strictly improves the design without trade-offs." This is a protocol violation even if the caller's judgment about quality is correct. The user chose light precisely to remain in control of what gets applied.
- Response storage. Save raw response to .agent-work/<peer>-responses/[CR|XR]-<short-task>.md if longer than ~1 KB; otherwise inline in chat under "Peer raw response".
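The verdict-extraction rule above can be sketched as follows. Illustrative only — the scripts do not parse verdicts, and the treatment of an unrecognized first token as ESCALATE is an assumption the rule does not spell out:

```python
import re

def extract_verdict(response: str) -> str:
    """Return AGREE / REVISE / ESCALATE per the verdict-extraction rule (sketch).

    Missing ## VERDICT section -> ESCALATE. An unrecognized first token is
    also treated as ESCALATE here (assumption; the rule leaves this case open).
    """
    m = re.search(r"^##\s*VERDICT\s*$(.*)", response,
                  re.MULTILINE | re.DOTALL | re.IGNORECASE)
    if not m:
        return "ESCALATE"
    tokens = m.group(1).split()
    if tokens and tokens[0].upper() in {"AGREE", "REVISE", "ESCALATE"}:
        return tokens[0].upper()
    return "ESCALATE"
```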
### Status file (live progress tracking)
For any task that crosses the 8-min MUST-call trigger or uses worker mode, the caller MUST create a status file BEFORE the first peer call AND post its bare relative path (resolvable under the chat's current working directory) in chat as the first thing the user sees about the task. The user opens that one file in their editor and watches it update — single source of truth, no digging.
Light mode exception: /peer light never creates a status file regardless of size, action, or risk-keyword match. Light is intentionally lightweight — one-pass advisory, no tracking infrastructure. If a light-mode task is large enough that you'd want a status file, that's the signal to upgrade to medium (or use --force if you really want a long advisory pass with no tracking).
Silent skips are not allowed. If you decide not to create a status file, state explicitly in your turn-end summary: Status-file rule fired — chose not to create because <specific reason>. The peer-call scripts emit a [STATUS-FILE REMINDER] banner on every invocation; if you see that banner and have not created or named a status file, you have not yet complied with this rule.
Path: <chat-cwd>/.agent-work/status/<task-id>.md. The chat CWD is the directory the chat session is rooted at, NOT necessarily the codebase being edited; when those differ (e.g., editing ~/.claude/skills/peer/ from a chat at ~/Desktop/Misc/), .agent-work/ lives at ~/Desktop/Misc/.agent-work/. Task ID: short kebab-case slug matching the prompt-file naming.
Initial creation: before round 1 fires. Header populated, round log empty. Update after every round: overwrite the header fields, append a new round entry to the log.
Path formatting (mac-verified; the renderer rules differ between chat output and status-file content, so the two cases have different rules):
- Chat-message paths. Every file path posted in chat MUST be a bare relative path resolvable under the chat's CWD — no markdown link brackets, no @ prefix, no absolute paths. Claude Code's mac chat renderer only makes a path clickable when it (a) has no surrounding markdown syntax and (b) resolves to a file under the chat CWD. Markdown-link syntax [label](path), absolute paths (/Users/.../file.md), @-mentions, and file:// URIs all render as plain text and cannot be opened. Example of a clickable chat path: .agent-work/codex-prompts/XP-task-plan-r1.md.
- Status-file internal references. Paths inside a status file's content (round-log entries, "Active artifacts" links, etc.) ARE allowed to use markdown link syntax. The status file is rendered by Claude Code's side-panel file viewer, which handles markdown links correctly. Use paths relative to the status file's own directory. From .agent-work/status/, sibling artifact folders are reached with ../. Examples:
  - To a prompt: [Plan round 1 prompt](../codex-prompts/XP-task-plan-r1.md)
  - To a response: [Plan round 1 response](../codex-responses/XR-task-plan-r1.md)
  - To a file at the project root: [CLAUDE](../../CLAUDE.md)
The add_peer_status_round_entry helper already emits this form; do not change it.
Getting either rule wrong silently breaks the click-to-open UX. Operational consequence: every artifact this skill produces (status files, prompts, responses, worker proposals) MUST live under the chat's CWD. The convention "<project>/.agent-work/ where project = the codebase being edited" only works when the chat is rooted at that codebase. When the chat is rooted elsewhere, .agent-work/ lives at <chat-cwd>/.agent-work/, NOT at <edited-codebase>/.agent-work/.
Worker mode is materially degraded on mac: worktrees at ~/.peer-worktrees/<peer>/<project-id>/<branch>/ live outside any reasonable chat CWD and cannot have clickable chat links. For tasks where clickable artifacts matter, prefer review-mode + direct caller edits over worker-mode on mac.
Header field spacing: the header fields below MUST be separated by blank lines exactly as shown. Markdown collapses adjacent **Field:** lines into a single paragraph otherwise.
Quickest path to a conformant file: the bootstrap installs .agent-work/status/.template.md per project (P1 substep, content-hash refresh). Copy it: cp .agent-work/status/.template.md .agent-work/status/<task-id>.md, then edit the placeholders. Do NOT generate the structure from memory — Piece 9's schema validator (peer_lib.status_schema_check.invoke_status_file_schema_check invoked from the ask-*.py scripts) hard-blocks the next peer call if the file diverges from the canonical schema.
Format:
# Status: <task-id>
**Description:** <one-line task description>
**Status:** in-progress | DONE | ESCALATE - needs your decision | ERROR
**Phase:** plan | implementation | done
**Round:** plan/impl round N of 3 (or "done" / "blocked")
**Last verdict:** AGREE | REVISE (B:n S:n O:n) | ESCALATE
**Codex tokens (cumulative):** ~Nk (sourced from `tokens used` line in script stdout; Claude tokens not tracked here — the harness exposes no counter to the script or to the calling AI)
---
## Active artifacts
(Direct file links the user can right-click open. Update as new artifacts appear.)
- Worktree: `<home>/.peer-worktrees/<peer>/<project-id>/<branch-name>/`
- Worker proposal: `[Worker proposal](../<peer>-workers/<CW|XW>-<task>.md)` (only when worker has run; the exact filename is whatever the caller's prompt instructed the worker to write — the worker scripts no longer auto-derive an output path)
- Latest prompt: `[Latest prompt](../<peer>-prompts/<CP|XP>-<task>-<phase>-r<round>.md)`
- Latest response: `[Latest response](../<peer>-responses/<CR|XR>-<task>-<phase>-r<round>.md)`
- Specific files being worked on (list any that the worker is editing)
---
## What I need from you
(only present when Status = ESCALATE; otherwise omit this section)
- <plain-English decision needed, no jargon>
---
## Round log
### Round 1 — plan, REVISE
- [Round 1 prompt](../codex-prompts/XP-<task>-plan-r1.md)
- [Round 1 response](../codex-responses/XR-<task>-plan-r1.md)
- One- or two-sentence summary of what happened and what's next.
- Optional 2-3 bullets ONLY if material. Skip otherwise.
Length rule: keep each round entry tight. 1-2 sentence summary. 2-3 bullets only if they would change what the user does next. Don't waste tokens narrating things visible in the linked files.
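The Codex tokens (cumulative) header field is fed from the `tokens used` line in script stdout. A sketch of that accumulation — the exact stdout wording is an assumption, and the helper name is hypothetical:

```python
import re

def cumulative_tokens_k(stdouts):
    """Sum per-round Codex token counts into the header's ~Nk figure (sketch).

    Assumes each run's stdout contains a line like "tokens used: 12,345";
    the real scripts' exact wording is not pinned down in this document.
    """
    total = 0
    for out in stdouts:
        m = re.search(r"tokens used:?\s*([\d,]+)", out, re.IGNORECASE)
        if m:
            total += int(m.group(1).replace(",", ""))
    return f"~{round(total / 1000)}k"
```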
Status transitions:
- in-progress → DONE on final-round AGREE that ends iteration
- in-progress → ESCALATE when the protocol surfaces a decision to the user (always populate "What I need from you")
- in-progress → ERROR if a script call fails, hangs, or aborts (note the failure briefly, suggest next step)
Round log is append-only. Never delete prior round entries; the history is itself useful.
Folder layout: flat. One status file per task in .agent-work/status/, no per-task subfolders. Same convention applies to prompt/response/proposal files in their respective folders — task ID lives in the filename, not a subfolder.
Task completion: when a task reaches a final state, the AI does the following:
- Update the status file's Status: field to DONE, ESCALATE, or ERROR as appropriate. The file stays in .agent-work/status/ — no folder move, no archive. The Status: field is the source of truth for whether a task is active or finished.
- Clean up the worktree branches and folder if a worker ran for this task AND status is DONE:
  - git worktree remove --force <home>/.peer-worktrees/<peer>/<project-id>/<branch-name>
  - git branch -D <peer>/<branch-name> and git branch -D <peer>-base/<branch-name>
- Tell the user the task is complete in chat (no archived-path link needed since nothing moved).
Files with Status = ESCALATE or ERROR retain their worktrees until the underlying issue is resolved. Cleanup of old DONE status files is opt-in and manual. Lookup rule: when searching for prior task context, search .agent-work/status/. Both active and completed tasks live there; the Status: field tells you which.
### Worker isolation guarantees (read this before invoking worker mode)
See protocol.md §Worker isolation principles for the cross-AI principle and trigger conditions. Claude-specific deny-tool list:
- Claude worker hard-denies main-repo writes, git push, common HTTP clients (curl, wget, iwr, Invoke-WebRequest, Invoke-RestMethod, Start-BitsTransfer), shell-read of .env* / .envrc files, and node/python eval flags (-e, --eval, -p, --print, -c) at the Claude CLI permission layer.
- Codex worker uses --sandbox workspace-write --cd <worktree> for filesystem isolation; network egress and other denies rely on advisory system-prompt rules.
### File naming
See protocol.md §File naming convention.
### Safety rules (apply to both directions)
See protocol.md §Safety rules.
## /peer user-facing command
/peer is the user-facing entry point that wraps the protocol above. The dispatcher (you) parses the command, picks effort/action/subagent settings, drafts the prompt, invokes the right script, and renders results.
### 1. Grammar
Command shape:
/peer [mode] [action] [flags] <task>
Parsing rules (apply in order):
- Mode consumption. Token 1 after /peer: if it matches light / medium / heavy / nuclear, consume as mode. Otherwise mode defaults to medium and the token stays in play for the next rule.
- Action / subcommand consumption. Next token: if it matches an action (review / worker / preview) or a top-level subcommand (status / resume / cancel / doctor / help / config / ask / uninstall), consume.
- Flags. Tokens prefixed -- are consumed as flags (--subagents, --budget <N>, --force, --user).
- Task body. Remaining tokens form <task>.
- Disambiguation rule. If after step 2 the FIRST TOKEN OF THE TASK BODY matches a known command name (light/medium/heavy/nuclear/review/worker/preview/status/resume/cancel/doctor/help/config/ask/uninstall) AND the user did not use --, error with: Did you mean '<token>' as a command, or as part of the task? Use '/peer ... -- <task>' to force task interpretation, or restructure your command.
- -- escape. /peer -- <text> consumes nothing; the whole tail becomes <task> in default mode (medium, no action). /peer <action> -- <text> keeps the action and forces the rest as task.
- Preview-wraps-mode special form. When preview is the consumed action (step 2), the parser also looks at the NEXT token for a mode word and consumes it if present. So /peer preview heavy implement payments parses as preview/heavy/"implement payments". This sub-rule applies ONLY to preview.
Worked examples:
| Input | mode | action | task | Notes |
|---|---|---|---|---|
| /peer fix the login bug | medium | auto | fix the login bug | bare; mode default; auto picks review vs worker |
| /peer heavy refactor the auth module | heavy | auto | refactor the auth module | mode consumed |
| /peer light review this snippet | light | review | this snippet | both consumed |
| /peer heavy worker --subagents implement payments | heavy | worker | implement payments | flag survives |
| /peer preview heavy implement payments | heavy | preview | implement payments | preview is the action; second-token mode consumed (preview wraps another command's mode) |
| /peer review the auth flow | medium | review | the auth flow | parses cleanly; first task token = "the" (not a command) |
| /peer review status | medium | review | (errors) | first task token = status, a command |
| /peer nuclear refactor the auth module | nuclear | auto | refactor the auth module | uncapped iteration; 2-step gate fires before any peer call (see subsection 14) |
| /peer review nuclear | medium | review | (errors) | first task token = nuclear, a known mode |
| /peer review -- status | medium | review | status | -- escape forces task |
| /peer -- review the auth flow | medium | auto | review the auth flow | full escape |
| /peer uninstall | medium | uninstall | (no task) | typed-only; not in dropdown; runs the uninstall handler (subsection 16) |
Note: an earlier draft of peer-v1-spec.md line 32 listed /peer review the auth flow as an error case (now corrected). Per the parser rules above this is a valid invocation: action=review, task="the auth flow"; the first task token is the, not a command, so no error fires.
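The parsing rules and worked examples above can be sketched as code. This is illustrative only — the real dispatcher is the AI itself, not a script — and it simplifies flag-argument handling and subcommand tails:

```python
MODES = {"light", "medium", "heavy", "nuclear"}
ACTIONS = {"review", "worker", "preview"}
SUBCOMMANDS = {"status", "resume", "cancel", "doctor", "help", "config", "ask", "uninstall"}
KNOWN = MODES | ACTIONS | SUBCOMMANDS

def parse_peer(tokens):
    """Apply the /peer parsing rules in order; return (mode, action, task, flags)."""
    mode, action, flags = "medium", "auto", []
    i, forced = 0, False
    if tokens[:1] == ["--"]:                      # full escape: whole tail is task
        return mode, action, " ".join(tokens[1:]), flags
    if i < len(tokens) and tokens[i] in MODES:    # rule 1: mode consumption
        mode, i = tokens[i], i + 1
    if i < len(tokens) and tokens[i] in ACTIONS | SUBCOMMANDS:  # rule 2
        action, i = tokens[i], i + 1
        if action == "preview" and i < len(tokens) and tokens[i] in MODES:
            mode, i = tokens[i], i + 1            # preview-wraps-mode special form
    if i < len(tokens) and tokens[i] == "--":     # -- escape after action
        forced, i = True, i + 1
    while i < len(tokens) and tokens[i].startswith("--"):       # rule 3: flags
        flags.append(tokens[i]); i += 1
        if flags[-1] == "--budget" and i < len(tokens):
            flags.append(tokens[i]); i += 1       # --budget takes a value
    task = tokens[i:]                             # rule 4: task body
    if not forced and task and task[0] in KNOWN:  # rule 5: disambiguation
        raise ValueError(f"Did you mean '{task[0]}' as a command, or as part of the task?")
    return mode, action, " ".join(task), flags
```

Running the worked examples through this sketch reproduces the table's mode/action/task columns, including the two error rows.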
### 2. Modes (effort + iteration depth)
| | light | medium (default) | heavy | nuclear |
|---|---|---|---|---|
| Codex effort | low | low | medium | high |
| Claude effort | mid | high | xhigh | max |
| Iteration | 1 round, advisory only | up to 3 rounds | up to 3 plan + 3 impl rounds | uncapped; loop-detection convergence |
| Worker auto-trigger | never (review only) | >15 min + clean boundary + verification gate | >5 min + clean boundary | always (worker by default; explicit review action overrides) |
| Subagents | never | reviewer's call based on diff size | always on (A/B/C; D for risky tasks) | mandatory caller-side A+B+C+D (4 sub-agents); worker/reviewer-side fanout per existing rules |
| Caller-side subagent model floor (v1.6.2) | n/a | Haiku (default) | Haiku for A/B/C; Sonnet for D (security) | Sonnet floor for A/B/C; Opus for D (security). Haiku forbidden for nuclear caller-side. |
| Status file | never | mandatory at 8-min threshold | always | always |
| Risk detected (auth/secrets/payments/migrations/destructive) | asks to upgrade to medium | proceeds | proceeds | proceeds (already at max) |
| Tiny task detected (one-line typo, etc.) | proceeds | proceeds | asks to downgrade to medium | asks to downgrade to medium |
| User gate | none | none | none | 2-step: y/n + typed NUCLEAR (see subsection 14) |
max Claude effort is the nuclear-mode default; available for other modes via /peer config set claude_effort max or future override flag.
#### Dispatcher-to-script effort mapping (LOCKED — do not bump)
When dispatching a /peer mode invocation, the dispatcher MUST pass these exact -Effort values to the underlying scripts:
| /peer mode | Codex --effort | Claude -Effort |
|---|---|---|
| light | low | mid |
| medium | low | high |
| heavy | medium | xhigh |
| nuclear | high | max |
These are the built-in per-mode values. Explicit user/project config overrides from /peer config set codex_effort or /peer config set claude_effort still win; do not apply any automatic difficulty bump after resolving config.
Do NOT apply the difficulty-bump rule from the "Effort defaults" section on top of the mode mapping. The mode IS the effort selector for /peer dispatches.
Common bug — heavy ≠ nuclear effort. Passing --effort high for /peer-heavy is wrong unless config explicitly overrides it. high is the NUCLEAR-tier Codex effort, not heavy. Heavy = medium. If you are about to invoke ask-codex-*.py --effort high and the active mode is heavy (with no codex_effort config override), STOP and use medium instead.
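The locked mapping and the config-wins rule reduce to a plain lookup. A minimal sketch — the function name and the override parameter are hypothetical, but the values are the table's:

```python
MODE_EFFORT = {
    "light":   {"codex": "low",    "claude": "mid"},
    "medium":  {"codex": "low",    "claude": "high"},
    "heavy":   {"codex": "medium", "claude": "xhigh"},
    "nuclear": {"codex": "high",   "claude": "max"},
}

def dispatch_effort(mode, peer, config_override=None):
    # Explicit config (codex_effort / claude_effort) wins; otherwise the locked
    # per-mode value. No automatic difficulty bump on /peer dispatches.
    return config_override or MODE_EFFORT[mode][peer]
```

Note how the common bug falls out: `dispatch_effort("heavy", "codex")` is `medium`, never `high`, unless config explicitly says otherwise.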
### 3. Actions (independent of mode)
| Action | Behavior |
|---|---|
| review | Read-only opinion. Peer cannot edit code. |
| worker | Sandboxed implementation. Peer creates a worktree, edits in isolation, returns a proposal. Caller integrates manually. |
| preview | Shows what would happen (which mode, which AI, what prompt) without actually calling anyone. Free. |
If no action specified, the mode's auto-trigger logic decides between review and worker (see modes table's Worker auto-trigger row).
### 4. --subagents flag
Independent of mode and action. Multiple sub-agents tackle the task from different angles in parallel; parent synthesizes.
--subagents on review = R-1 (correctness), R-2 (robustness), R-3 (security, conditional on risk-keyword match).
--subagents on worker = A (simplest), B (robust), C (tests), D (security, conditional on risk-keyword match).
- Heavy mode includes --subagents automatically. Medium uses reviewer's judgment. Light errors if --subagents is passed.
- Nuclear mode mandates 4 caller-side sub-agents A+B+C+D and ignores --subagents if passed (subagents always on; cannot disable). Worker/reviewer-side internal fanout follows existing rules (nuclear does not change what the peer worker spawns inside its own process).
### 5. --budget <N> flag
v1 advisory only. The dispatcher embeds the budget in the prompt's ## Caller Constraints / User Decisions section as: Budget: ~N tokens for this iteration. Plan accordingly. Both AIs are asked to respect it. No script-level enforcement. Project-wide defaults via /peer config set <mode>_budget <N>. Hard enforcement deferred to v2 (Codex feasible via stdout parse + kill; Claude advisory-only for the foreseeable future).
### 6. Per-command behavior
-
/peer <task> — medium mode default, auto-action based on the modes table.
-
/peer light <task> — quick second opinion. One pass, no iteration. No status file ever (intentionally lightweight). After the peer responds, the caller surfaces the response + a 1–2 sentence stance, then stops — no implementation, no auto-apply, no further iteration regardless of verdict. The user decides what to do next. If risk detected (see Risk detection subsection), prompts to upgrade to medium.
-
/peer heavy <task> — high scrutiny, subagents auto-on, longer iteration. Codex --effort medium, Claude -Effort xhigh. If tiny-task detected, prompts to downgrade to medium.
-
/peer nuclear <task> — uncapped iteration above heavy. Claude max effort, Codex high effort. Multi-agent caller-side sub-agents A+B+C+D mandatory (4 sub-agents attacking the task in parallel; cannot disable). Worker/reviewer-side fanout follows existing rules — nuclear does not alter what the peer worker spawns internally. Worker mode by default; pass review to force read-only nuclear. Tiny-task signal still prompts to downgrade. REQUIRES 2-step user confirmation (see subsection 14, Nuclear gate). Status file always created. Iteration runs until AGREE OR two consecutive rounds produce no new findings (loop-detection convergence; see OPERATING PROTOCOL → Iteration-to-agreement protocol → Nuclear-mode convergence). Use only for high-stakes work where you genuinely need exhaustive scrutiny — heavy mode typically uses ~300k Codex tokens; nuclear typically uses 1M+. Reviewer can ESCALATE at any round to surface a fundamental disagreement.
-
/peer review <task> — force review action. Mode = medium unless the user prefixed a mode.
-
/peer worker <task> — force worker action. Always creates a status file regardless of size estimate.
-
/peer preview <task> — show what would happen without calling anyone. Free. Prints the would-be Parsed: line + would-be prompt path + would-be peer + would-be effort/sub-agents.
-
/peer ask <question> — quick cross-check. NO iteration, NO status file. Dispatcher packages recent chat context (last ~3 user msgs + last AI response, ~8-10K tokens, expandable if the question references older content) from its own working memory into the prompt file, sends one prompt to the other AI, prints a 1-2 sentence stance from itself plus a verbatim copy of the other AI's full reply under ### Codex says: or ### Claude says:. No artifacts retained beyond the response file.
-
/peer status [N] — dashboard: ID + description + Status + last verdict + round for the last N tasks (default N=20; pass /peer status 40 etc. to see more). Reads .agent-work/status/*.md, sorted by mtime DESC. Use this to find a task ID and see the dashboard at the same time. Designed for between-tasks overview, NOT live mid-task progress. Claude Code serializes user messages — typing /peer status while a /peer heavy or /peer nuclear task is running queues it behind the active work (5-10+ minute wait). For live mid-task progress, open the task's status file directly in your editor (.agent-work/status/<task-id>.md); the AI updates it after every round, so it's the live dashboard. The status file's bare relative path (resolvable under chat CWD) is posted in chat at task start for right-click open.
Version banner (v1.6.1): every /peer status and /peer status <id> output begins with a one-line skill-version banner before the dashboard:
peer-skill v<VERSION> | install: <ISO timestamp> | update: <state>
<VERSION> — read from ~/.claude/skills/peer/VERSION (single line, e.g. 1.6). Fallback unknown if file missing.
<ISO timestamp> — read from ~/.claude/peer/install-timestamp.txt. Fallback not recorded if missing.
<state> — by default not checked (run /peer status --check-update). Network call skipped on plain /peer status so the dashboard renders instantly. With --check-update flag (see next bullet), state becomes one of: up to date, N commits behind origin/main (run /peer update), or not a git install (manual: replace ~/.claude/skills/peer/).
-
/peer status --check-update — same as /peer status but also performs a one-shot remote-update check: detect .git/ in ~/.claude/skills/peer/, run git -C ~/.claude/skills/peer fetch --quiet 2>/dev/null, compute git rev-list --count HEAD..@{u}. Updates the version banner's <state> field. Adds 1-3 seconds to the command. If the skill folder is not a git repo, banner reports not a git install and prints manual update instructions (per v2 §6 design). Dashboard rendering is unaffected. Flag also accepted by /peer status <id> and /peer status [N].
-
/peer status <id> — full status file rendered inline. Use for everything about one task. Same queueing caveat as /peer status — for live updates on the active task, open the file in your editor instead.
-
/peer resume <id> — pick a task back up after closing the app or starting a new chat. Reads .agent-work/status/<id>.md, finds the latest round entry, identifies open REVISE items not marked APPLIED/REJECTED/ACKNOWLEDGED, drafts a next-round prompt file in .agent-work/<peer>-prompts/ with those items, shows a preview, and asks the user to confirm before invoking the script.
-
/peer cancel <id> — stop a stuck task and clean up its sandbox. Confirmation required: print intended cleanup actions (worktree remove, branches delete, run file delete) and ask Confirm cancel + cleanup of <id>? (y/n). On confirm: git worktree remove --force + git branch -D for both <peer>/<branch> and <peer>-base/<branch> if a worker ran, remove .agent-work/runs/<id>.run if present, update status file Status: -> ESCALATE with a What I need from you entry: Task cancelled by user. Was at <last verdict>; review prompts/responses if you want to revisit. Status file stays in place.
-
/peer doctor — health check. Verifies prereqs (Claude/Codex/Python/git/CWD-at-project-root), checks runtime scripts at ~/.codex/tools/ AND runtime libs at ~/.codex/tools/lib/peer_lib/ (8 modules: __init__.py, common.py, sidecar.py, piece6_check.py, status_schema_check.py, status_update.py, process.py, settings_patcher.py) against canonical via byte-for-byte content hash (matches the U2/U5 refresh logic — flags any drift), checks the 6 wrapper commands at ~/.claude/commands/peer-*.md against canonical, reports manifest state at ~/.claude/peer/installed-files.txt (present + line count, or missing), peer-deny-bash status, active task count (from .agent-work/status/*.md where Status: in-progress), runtime of any in-flight peer calls via the PID registry at .agent-work/runs/ (see PID registry subsection below), and orphaned worktrees (folders under <home>/.peer-worktrees/<peer>/<project-id>/ with no matching live status file). Offers fix-it: re-run user-global phase to refresh drifted scripts/wrappers (--fix), clean up orphan run files, prune orphan worktrees (opt-in via --prune-worktrees since worktrees can hold uncommitted work). Standalone terminal entry point: py -3 ~/.codex/tools/peer-doctor.py — runs without queuing behind the active chat's work. Same queueing caveat as /peer status still applies if invoked via the slash command: typing /peer doctor mid-task waits behind the active work. Practical workaround: open a second Claude Code chat in the same project and run /peer doctor there, or use the standalone terminal entry point.
/peer help — cheat sheet. Grammar at the top, then commands grouped lightest->heaviest, with one valid example and one invalid example per command.
/peer config — show effective config (built-in < user < project; see Config layering subsection).
/peer config set <key> <value> — change a setting in project config (default) or user config (with --user). See settable-keys table.
/peer config reset <key> — remove the key from project config (default) or user config (with --user). Inheritance falls through to next layer.
/peer uninstall — typed-only (NO /peer-uninstall wrapper exists). Reads the install manifest at ~/.claude/peer/installed-files.txt, previews what will be removed, asks Type DELETE to confirm:, on confirm removes wrappers + runtime scripts + (live-checked) user config in that locked order. Prints final manual step (delete the skill folder yourself). Never touches any project's .agent-work/. See subsection 16 for full semantics.
7. Settable config keys
| Key | Controls |
|---|---|
| default_mode | light / medium / heavy for bare /peer <task> |
| force_agent_d | true / false — always include the security sub-agent, even on non-security tasks |
| worker_threshold_minutes | Minutes before worker mode auto-triggers (medium default 15, heavy default 5) |
| auto_status_file | true / false — force status file even on tasks below the 8-min threshold |
| default_peer | claude or codex — which AI to call from bare commands |
| claude_effort | Override Claude effort: mid / high / xhigh / max |
| codex_effort | Override Codex effort: low / medium / high |
| <mode>_budget | Project-wide token budget for the named mode (e.g., heavy_budget = 100000) |
Format: TOML. Top-level keys map directly. /peer help config shows this list compactly.
Nuclear-mode behavior (effort, mandatory caller-side A+B+C+D, 2-step gate, loop-detection convergence) is NOT user-tunable. Use /peer heavy if you want manual control over those knobs.
8. Risk detection (light -> medium upgrade)
Trigger keywords (case-insensitive, whole-word match where possible): auth, authn, authz, password, secret, token, key, credential, payment, migration, migrate, delete, drop, truncate, destroy, production, prod, deploy, pipeline. Plus diff-size threshold: if the task references a current diff or branch, >50 lines OR >5 files also fires.
When the trigger fires under light mode, the dispatcher prints:
Risk signal detected: <which keyword(s) or diff stats>.
Light mode is one-pass advisory; this task warrants iteration.
Upgrade to medium? (y/n, or `/peer light --force <task>` to bypass)
/peer light --force bypasses entirely.
Nuclear mode does NOT fire the risk-detection upgrade prompt (it is already at maximum effort and subagent count).
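The risk-detection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the dispatcher's actual implementation: the function name `risk_signals` and the `diff_lines`/`diff_files` arguments are hypothetical, but the keyword list and the >50-line / >5-file thresholds are taken from the rule as written.

```python
import re

# Keyword list from the risk-detection rule (case-insensitive, whole-word).
RISK_KEYWORDS = [
    "auth", "authn", "authz", "password", "secret", "token", "key",
    "credential", "payment", "migration", "migrate", "delete", "drop",
    "truncate", "destroy", "production", "prod", "deploy", "pipeline",
]
_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, RISK_KEYWORDS)) + r")\b", re.IGNORECASE
)

def risk_signals(task: str, diff_lines: int = 0, diff_files: int = 0):
    """Return the list of risk signals a light-mode task fires (empty = no upgrade prompt)."""
    signals = sorted({m.lower() for m in _KEYWORD_RE.findall(task)})
    # Diff-size threshold: only meaningful when the task references a current diff/branch.
    if diff_lines > 50 or diff_files > 5:
        signals.append(f"diff size ({diff_lines} lines / {diff_files} files)")
    return signals
```

Whole-word matching means "auth" fires on "auth flow" but not on "author", which matches the "whole-word match where possible" caveat in the rule.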
9. Tiny-task detection (heavy -> medium downgrade)
Trigger under heavy mode: single file, <10 lines diff estimate, no risk-keyword match. The dispatcher prints:
Tiny-task signal: <file>, ~<N> line diff, no risk keywords.
Heavy mode is overkill for this; medium handles it without subagents/extra rounds.
Downgrade to medium? (y/n, or `/peer heavy --force <task>` to bypass)
Nuclear mode also fires the tiny-task prompt (nuclear on a one-line typo is absurd). /peer nuclear --force <task> bypasses the tiny-task prompt but does NOT bypass the 2-step nuclear gate (subsection 14).
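The symmetric tiny-task check can be sketched the same way. The function name `tiny_task_signal` is hypothetical and the risk-word tuple below is abbreviated for illustration; the real rule reuses the full keyword list from subsection 8.

```python
def tiny_task_signal(files_touched: int, est_diff_lines: int, task: str) -> bool:
    """True when a heavy/nuclear task qualifies for the medium-downgrade prompt:
    single file, <10-line diff estimate, no risk-keyword match."""
    risk_words = ("auth", "password", "secret", "deploy", "migration")  # abbreviated
    has_risk = any(w in task.lower() for w in risk_words)
    return files_touched == 1 and est_diff_lines < 10 and not has_risk
```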
10. Argument parsing precedence + dispatcher echo
Before any tokens are spent on a peer call, the AI prints the parse result so misparses are visible. Required even for preview.
Parsed: mode=<x>, action=<y>, flags=[<...>], task="<...>"
For /peer nuclear, the dispatcher MUST run the 2-step nuclear gate (subsection 14) BEFORE printing the Parsed: line and creating the status file. Cancellation at the gate produces no artifacts. See subsection 14 for the full ordered sequence.
11. Config layering
Three layers, lowest priority first:
- Built-in defaults — embedded in this SKILL.md (mode -> effort table, default subagent rules, etc.).
- User config — ~/.claude/peer/config.toml (NOT under the skill folder, so it survives git pull of the skill). Persists across projects.
- Project config — <project>/.agent-work/config.toml. Per-project overrides.
/peer config shows the effective merge. /peer config set <key> <value> writes to project config by default; --user flag writes to user config. /peer config reset <key> removes the key from project config (or --user removes from user config).
Example project .agent-work/config.toml:
default_mode = "heavy"
force_agent_d = true
heavy_budget = 100000
12. Prompt-language rule (dispatcher when auto-drafting prompts)
See protocol.md §Prompt-language rule.
13. PID registry (used by /peer doctor)
All four peer-call scripts write a one-file marker at .agent-work/runs/<task-id>.run at startup with these fields (one per line, key=value, UTF-8):
PID=<integer>
Peer=<claude-review|claude-worker|codex-review|codex-worker>
TaskID=<derived from InputFile basename>
StartTime=<ISO 8601 UTC>
InputFile=<absolute path>
WorktreeName=<safeName, worker scripts only>
TaskID derivation: from InputFile basename, strip .md suffix, strip leading two-letter prefix + dash (XP-/CP-/XR-/CR-/XW-/CW-), strip trailing -plan-r<digits> or -impl-r<digits> if present.
The scripts remove the run file on exit (success OR failure OR exception). /peer doctor reads .agent-work/runs/*.run, parses, checks PID liveness via Get-Process -Id <PID> (Windows) or kill -0 <PID> (POSIX). Live PID -> in-flight: <peer> <task-id> running for <elapsed>. Dead PID -> orphan: <peer> <task-id> (PID <pid> no longer exists). Doctor offers cleanup of orphan run files.
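The run-file parse, POSIX liveness probe, and TaskID derivation described above can be sketched as follows. A hedged sketch: function names are illustrative, and the liveness check shows only the POSIX `kill -0` path (on Windows the doctor shells out to `Get-Process` instead, since `os.kill` semantics differ there).

```python
import os
import re
from pathlib import Path

def parse_run_file(path: Path) -> dict:
    """Parse the key=value .run marker written by the peer-call scripts."""
    fields = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip()
    return fields

def pid_alive(pid: int) -> bool:
    """POSIX probe: signal 0 delivers nothing but checks the PID exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but is owned by another user
    return True

def derive_task_id(input_file: str) -> str:
    """InputFile basename -> task id, per the derivation rule above."""
    name = Path(input_file).name
    name = re.sub(r"\.md$", "", name)
    name = re.sub(r"^(XP|CP|XR|CR|XW|CW)-", "", name)
    name = re.sub(r"-(plan|impl)-r\d+$", "", name)
    return name
```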
14. Nuclear gate (2-step confirmation)
When the dispatcher parses /peer nuclear <task> (with or without --force), it MUST execute the 2-step gate before any peer call, status-file update, or sub-agent spawn.
Order of operations for /peer nuclear invocations:
1. Parse silently. Do NOT print the Parsed: line yet. Determine mode = nuclear, action (default worker, or explicit), flags (including --force), task body.
2. Tiny-task check. If the task is a single-file <10-line diff with no risk keywords AND --force was not passed, print the tiny-task downgrade prompt:
Tiny-task signal: <file>, ~<N> line diff, no risk keywords.
Nuclear mode is overkill for this; medium handles it without subagents/extra rounds.
Downgrade to medium? (y/n, or `/peer nuclear --force <task>` to bypass)
   - If user replies y/yes, proceed as /peer medium <task> (skip the rest of the nuclear flow).
   - If user replies n or anything else, continue to step 3.
3. 2-step nuclear gate. Run step 3a (warning + y/n) and step 3b (typed NUCLEAR).
4. Print the Parsed: line (only after the gate passes):
Parsed: mode=nuclear, action=<x>, flags=[<...>], task="<...>"
5. Create status file at <chat-cwd>/.agent-work/status/<task-id>.md per existing rules.
6. Spawn 4 caller-side sub-agents (mandatory A+B+C+D) and begin plan iteration.
Step 3a — warning + y/n confirm. Print verbatim:
NUCLEAR MODE — uncapped iteration with maximum effort.
This will use Claude `max` effort and Codex `high` effort.
Multi-agent caller-side sub-agents A+B+C+D are mandatory (cannot disable).
Iteration runs until AGREE or until two consecutive rounds produce no new findings.
Heavy mode typically uses ~300k Codex tokens; nuclear typically uses 1M+.
Use only when you genuinely need exhaustive scrutiny.
Continue? (y/n)
If the user replies anything other than y/Y/yes/YES, abort with Nuclear cancelled. and exit. No status file is created.
Step 3b — typed confirmation. If step 3a returns y, print verbatim:
To confirm nuclear mode, type the literal word `NUCLEAR` (uppercase, no quotes):
If the user's reply, after trimming leading/trailing whitespace, is not exactly the 7-character string NUCLEAR, abort with Nuclear cancelled — confirmation phrase did not match. and exit.
If the reply matches, proceed to step 4 above.
Why typed confirmation: y/n prompts get muscle-memory-accepted. Typing NUCLEAR requires deliberate intent. The phrase is fixed in this version; not user-customizable.
--force does NOT bypass the gate. --force only bypasses the tiny-task downgrade prompt in step 2. The 2-step nuclear gate in step 3 is mandatory for every nuclear invocation.
No bypass for testing. Nuclear has no test/dry-run bypass. Use /peer preview nuclear <task> to see what nuclear would do without firing the gate (preview is a separate action that does not invoke any peer call).
15. Dropdown commands vs typed subcommands
Six commands have their own top-level slash command in Claude Code's autocomplete dropdown:
/peer-light, /peer-heavy, /peer-nuclear — mode shortcuts
/peer-help, /peer-status, /peer-doctor — display/utility shortcuts
These wrappers exist as files at ~/.claude/commands/peer-<X>.md (user-level). Each wrapper delegates to the main /peer skill — clicking /peer-heavy from the dropdown is functionally identical to typing /peer heavy <task>.
Wrappers DO NOT exist for:
- Action overrides (review, worker, preview) — typed only.
- Task management (resume, cancel) — typed only. (status has its own dropdown wrapper as /peer-status; tasklist was removed in v1.5 — /peer status is a strict superset and supports the [N] argument for "show more than 20".)
- Settings (config, config set, config reset) — typed only.
- Cross-check (ask) — typed only.
- Uninstall (/peer uninstall) — typed only, intentionally NOT in the dropdown (don't want it accidentally clickable).
To invoke any non-dropdown command, type the full form: /peer review the auth flow, /peer status, /peer config set claude_effort max, /peer uninstall, etc. The main /peer parser handles all forms identically.
The skill's frontmatter argument-hint field points users at /peer-help for the full command list when they're browsing the dropdown.
16. /peer uninstall (typed-only, NOT in dropdown)
Invoke ONLY by typing /peer uninstall. The parser routes the literal token uninstall to this handler (added to the parser's known-subcommand list in subsection 1). There is no /peer-uninstall wrapper — by design. Don't add one.
Order of operations:
1. Read the manifest at ~/.claude/peer/installed-files.txt.
- If present and non-empty: continue to step 2 with the file list.
- If absent or empty: print:
No manifest found. Either /peer was never bootstrapped on this machine,
or the manifest was hand-deleted.
Without a manifest, automatic uninstall is unsafe (don't know what to remove).
Manual cleanup options:
1. Re-run /peer in any project to regenerate the manifest, then run /peer uninstall again.
2. Delete known paths manually:
~/.claude/commands/peer-light.md ~/.claude/commands/peer-heavy.md
~/.claude/commands/peer-nuclear.md ~/.claude/commands/peer-help.md
~/.claude/commands/peer-status.md ~/.claude/commands/peer-doctor.md
~/.codex/tools/ask-claude-review.py ~/.codex/tools/ask-claude-worker.py
~/.codex/tools/ask-codex-review.py ~/.codex/tools/ask-codex-worker.py
~/.codex/tools/peer-status.py ~/.codex/tools/peer-doctor.py
~/.codex/tools/lib/peer_lib/ (folder; 8 .py modules)
~/.claude/peer/ (folder)
~/.claude/skills/peer/ (folder; the skill itself)
Then exit. NO automatic deletion when manifest is missing.
2. Print preview block with source distinction explicit:
/peer uninstall — about to remove:
From manifest:
<wrapper paths from manifest>
<runtime script paths from manifest>
Live-checked (not in manifest):
~/.claude/peer/config.toml (only listed if present)
Will NOT remove:
~/.claude/skills/peer/ (manual final step)
Any project's .agent-work/ (per-project history)
3. Print: Type DELETE (uppercase, no quotes) to confirm:. If user reply, after trim, is not exactly the 6-character string DELETE, abort with Uninstall cancelled — confirmation phrase did not match. and exit. No state changes.
4. Iterate per the categorized order (locked decision: wrappers → runtime scripts → user config). Pseudocode:
manifest = read_manifest_or_empty()
wrapper_paths = [p for p in manifest if matches(p, "~/.claude/commands/peer-*.md")]
runtime_paths = [p for p in manifest
                 if matches(p, "~/.codex/tools/ask-*.py")
                 or matches(p, "~/.codex/tools/peer-{status,doctor}.py")
                 or matches(p, "~/.codex/tools/lib/peer_lib/*.py")]
user_config_path = "~/.claude/peer/config.toml" if exists("~/.claude/peer/config.toml") else None
for path in wrapper_paths:
    try:
        remove(path); print("removed " + path)
        manifest.remove(path); write_manifest(manifest)
    except OSError:
        print("warn: could not remove " + path)
for path in runtime_paths:
    try:
        remove(path); print("removed " + path)
        manifest.remove(path); write_manifest(manifest)
    except OSError:
        print("warn: could not remove " + path)
if user_config_path:
    try:
        remove(user_config_path); print("removed " + user_config_path)
        # NO manifest.remove() — user config was never in the manifest.
    except OSError:
        print("warn: could not remove " + user_config_path)
remove(manifest_file); print("removed " + manifest_file)
if dir_empty("~/.claude/peer/"):
    remove_dir("~/.claude/peer/"); print("removed ~/.claude/peer/")
5. Print final manual step:
Cleanup complete except the skill folder itself.
Run this manually to finish:
rm -rf ~/.claude/skills/peer/
Why typed DELETE: y/n prompts get muscle-memory-accepted. Same logic as the nuclear gate. The phrase is fixed; not user-customizable.
--force does NOT bypass the confirmation. No flag skips the typed DELETE.
If manifest is missing or partial (e.g., user manually deleted some files), per step 1 above the uninstall does NOT auto-delete — it prints a manual cleanup list and exits.
On first invocation (bootstrap)
The bootstrap has TWO phases:
- User-global phase (always runs, fast, idempotent). Prereq checks, install/refresh runtime scripts at ~/.codex/tools/, install/refresh 6 wrapper commands at ~/.claude/commands/, write/update the manifest at ~/.claude/peer/installed-files.txt. These are user-level installs that affect every project. Runs SILENTLY on every /peer invocation. No user prompt; idempotent (no-op when state is current).
- Project-local phase (runs only on first /peer invocation in a project). Create .agent-work/ folder layout, scan CLAUDE.md/AGENTS.md for deny patterns, write .agent-work/peer-deny-bash.txt. Triggers the announcement+y/n confirmation below. Runs at most once per project (subsequent /peer calls in the project skip this phase entirely).
The user-global phase is fast (<1s) when scripts/wrappers/manifest are already current. The slow first-time npx @openai/codex download still happens during the user-global phase prereq check, but only on the very first /peer invocation across all projects on this machine.
Pre-step — Detect state and dispatch
For every /peer invocation:
- ALWAYS run the user-global phase silently first (steps U1-U4 below). The phase is content-hash idempotent — when scripts/wrappers/manifest already match canonical, U2/U3/U4 are fast no-ops. When anything has drifted (canonical updated by git pull, hand-edited script, missing wrapper, etc.), U2/U3 detect the mismatch and refresh. Skipping the user-global phase based on file existence alone would let drifted content go undetected — file-exists is necessary but not sufficient. No announcement, no prompt; runs silently regardless of project state.
- After the user-global phase completes, check project bootstrapped? = <project>/.agent-work/ exists?
- Dispatch:
  - Project bootstrapped: print one-line /peer skill is active for this chat and proceed to /peer command parsing.
  - Project NOT bootstrapped AND command requires project-local state: print announcement below + ask (y/n). On y, run project-local phase (P1-P3). On n, abort.
  - Project NOT bootstrapped AND display command (see rule below): proceed directly to /peer command parsing with empty project state. No announcement.
Display commands skip the project-local prompt. The commands /peer help, /peer status, /peer status <id>, /peer doctor are read-only display commands. They do NOT trigger the project-local phase prompt. If .agent-work/ doesn't exist when a display command runs, render with empty state (e.g., /peer status shows No tasks yet — run /peer <task> to start.). The user-global phase still runs silently.
Other commands DO trigger the project-local prompt when .agent-work/ is absent: /peer <task>, /peer light/medium/heavy/nuclear <task>, /peer review/worker/preview <task>, /peer resume <id>, /peer cancel <id>, /peer ask <question>, /peer config set/reset (when writing project layer), /peer uninstall (which doesn't need .agent-work/ but still benefits from a configured manifest — see subsection 16).
Project-local first-time announcement (render verbatim when project-local phase will run)
Render the following as markdown (do NOT wrap in a code fence; the user should see formatted text, not a code block):
First-time /peer setup for this project.
User-global setup just completed silently — prereq checks passed; runtime scripts at ~/.codex/tools/, the 6 dropdown wrappers at ~/.claude/commands/, and the manifest at ~/.claude/peer/installed-files.txt are now in place or refreshed. If this was the very first /peer invocation on this machine, the npx Codex CLI download added 1–3 min to that step.
About to do (gated by your confirmation below):
- Create .agent-work/ folder layout in this project — instant
- Scan CLAUDE.md / AGENTS.md for paid-pipeline deny patterns — instant
Total time: instant.
Continue? (y / n)
If user replies anything other than y/Y/yes/YES, abort with Bootstrap cancelled. Re-run /peer when ready. (Note: user-global setup at ~/.codex/tools/, ~/.claude/commands/, and ~/.claude/peer/ already completed silently — only the project-local .agent-work/ creation was skipped.) and exit. No project-local artifacts created.
User-global phase (steps U1–U4)
U1 — Prereq checks (v1.6: action-scoped per Piece 3)
Prereqs split into two tiers. Tier-1 always required; Tier-2 only required when the dispatched command runs the worker action (worker by default for /peer-nuclear; via auto-trigger for /peer-heavy >5min + clean boundary; via auto-trigger for /peer-medium >15min + clean boundary + verification gate; or via explicit /peer worker).
Tier 1 — always required (every command including review/display):
1. Claude Code CLI on PATH: claude --version
2. Codex CLI invokable: npx -y "@openai/codex" --version (one-time download is fine; the package is scoped — the unscoped codex is a different/unrelated package).
3. Python 3.9+ invokable: py -3 --version (Windows) or python3 --version (macOS/Linux). The four ask-*.py scripts and the peer-{status,doctor}.py standalones are Python; nothing else in the runtime path requires PowerShell.
4. ~/.codex/tools/ exists or can be created.
Tier 2 — required only for worker action / worker-auto-trigger commands:
5. Chat CWD equals the project root. Status-file clickable links and the worker's git-worktree creation both require this. If CWD ≠ project root for a worker invocation, abort with: Open a fresh chat in the project root and re-invoke /peer. For review-only invocations from a non-root chat, warn and proceed.
6. Project is a git repo: git status returns without error. (Worker mode uses git worktree add; review mode does not.)
Display commands (/peer-help, /peer-status, /peer-doctor) bypass both tiers entirely: they render with empty state if prereqs aren't met. /peer-doctor is special-cased: it runs the full prereq scan (both tiers) but treats failures as DIAGNOSTIC findings to report, never as fatal aborts. That is doctor's whole purpose — surface problems without blocking.
Review-only invocations (explicit /peer review, /peer ask, light/medium without worker auto-trigger): require Tier 1; treat Tier 2 as warning-only.
Worker invocations (explicit /peer worker, OR mode/auto-trigger fired): require both tiers as fatal.
U2 — Refresh runtime scripts at ~/.codex/tools/
For each of the 6 canonical scripts at ~/.claude/skills/peer/scripts/ask-{claude,codex}-{review,worker}.py and ~/.claude/skills/peer/scripts/peer-{status,doctor}.py:
- Compute destination at ~/.codex/tools/<name>.py.
- If destination missing → write canonical content. Done.
- If destination present → compare canonical content to destination content (byte-for-byte SHA-256 with CRLF/LF + BOM normalization, see peer_lib.common.sha256_normalized). If identical → no-op. If different → back up destination as <name>.py.bak.<YYYY-MM-DD-HHmmss> then overwrite with canonical content.
Content-based refresh — repeat invocations always end with destination = canonical. Backups are kept (no auto-cleanup; user can prune manually). Timestamp is to-the-second so multiple same-day overwrites don't replace earlier backups.
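The normalized hash that drives this comparison can be sketched as below. This is an illustration of the behavior U2 describes (strip a UTF-8 BOM, normalize CRLF to LF, then SHA-256), not necessarily the exact code inside `peer_lib.common.sha256_normalized`:

```python
import hashlib

def sha256_normalized(data: bytes) -> str:
    """Hash after stripping a UTF-8 BOM and normalizing CRLF -> LF,
    so cosmetic line-ending/BOM differences don't register as drift."""
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    data = data.replace(b"\r\n", b"\n")
    return hashlib.sha256(data).hexdigest()
```

With this, a script checked out with Windows line endings still hashes identically to the canonical LF copy, so U2 correctly treats it as current.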
U3 — Install/refresh wrapper commands at ~/.claude/commands/
For each of the 6 wrapper templates at ~/.claude/skills/peer/commands/peer-{light,heavy,nuclear,help,status,doctor}.md:
- Compute destination at ~/.claude/commands/<name>.md.
- If destination missing → write canonical. Done.
- If destination present and identical to canonical → no-op.
- If destination present and different → overwrite (no backup; wrappers are mechanical glue, never user-edited).
Wrapper files do NOT get backups — they're auto-generated stubs and the user shouldn't have edits worth preserving. (Runtime scripts in U2 DO get backups in case the user has hand-modified one.)
U4 — Write/update manifest at ~/.claude/peer/installed-files.txt
Create ~/.claude/peer/ directory if it doesn't exist. Write the manifest:
- One absolute path per line, UTF-8, LF endings.
- Include: each wrapper file path written in U3 + each runtime script path written in U2.
- DO NOT include ~/.claude/peer/config.toml (live-checked at uninstall time, not manifest-tracked).
- Sorted lexicographically for diff stability.
- Overwrite existing manifest (manifest reflects current install state).
Example contents (21 lines total):
~/.claude/commands/peer-doctor.md
~/.claude/commands/peer-heavy.md
~/.claude/commands/peer-help.md
~/.claude/commands/peer-light.md
~/.claude/commands/peer-nuclear.md
~/.claude/commands/peer-status.md
~/.claude/hooks/peer-route.py
~/.codex/tools/ask-claude-review.py
~/.codex/tools/ask-claude-worker.py
~/.codex/tools/ask-codex-review.py
~/.codex/tools/ask-codex-worker.py
~/.codex/tools/peer-doctor.py
~/.codex/tools/peer-status.py
~/.codex/tools/lib/peer_lib/__init__.py
~/.codex/tools/lib/peer_lib/common.py
~/.codex/tools/lib/peer_lib/piece6_check.py
~/.codex/tools/lib/peer_lib/process.py
~/.codex/tools/lib/peer_lib/settings_patcher.py
~/.codex/tools/lib/peer_lib/sidecar.py
~/.codex/tools/lib/peer_lib/status_schema_check.py
~/.codex/tools/lib/peer_lib/status_update.py
(Use absolute resolved paths in the actual file — e.g., C:\Users\<name>\.claude\commands\peer-doctor.md. The ~-prefix above is for documentation only.)
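The manifest-writing rules above (one absolute path per line, UTF-8, LF endings, lexicographic sort, full overwrite) can be sketched as a small helper. `write_manifest` is an illustrative name, not the actual U4 implementation:

```python
from pathlib import Path

def write_manifest(manifest_path: Path, installed_paths) -> None:
    """Overwrite the manifest: one absolute resolved path per line,
    sorted lexicographically for diff stability, UTF-8 with LF endings."""
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    lines = sorted(str(Path(p).expanduser().resolve()) for p in installed_paths)
    # newline="\n" forces LF even on Windows, matching the manifest spec.
    with open(manifest_path, "w", encoding="utf-8", newline="\n") as f:
        f.write("\n".join(lines) + "\n")
```

Because the file is rewritten wholesale on each run, the manifest always reflects current install state rather than accumulating stale entries.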
U5 — Install plan-mode-override hook + Piece 6 lib (v1.6, Pieces 5 + 6 + 7)
The peer-call workflow's correctness depends on three new install-time artifacts:
Hook installation:
1. Copy canonical ~/.claude/skills/peer/hooks/peer-route.py to runtime ~/.claude/hooks/peer-route.py (create directory if needed; idempotent content-hash check, same backup pattern as U2).
2. Patch ~/.claude/settings.json to register the hook under hooks.UserPromptSubmit. JSON-aware patch (read-parse-merge-write, not text replace) — use peer_lib.settings_patcher.patch_hook_command(). Idempotent: skip if any entry under hooks.UserPromptSubmit[*].hooks[*].command already contains the substring peer-route (no extension, so it matches both the legacy .ps1 and the v2 .py for transition compatibility). If absent, append a platform-appropriate command (Windows: py -3 "<path>" preferred for reliability over python.exe, which is the WindowsApps Store redirector; mac/Linux: /usr/bin/env python3 "<path>"):
{ "matcher": "", "hooks": [{ "type": "command", "command": "py -3 \"<runtime-path>\"" }] }
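A minimal sketch of that read-parse-merge-write patch, under the idempotency rule described above. This illustrates the behavior attributed to `peer_lib.settings_patcher.patch_hook_command()`; the actual module may differ in signature and error handling:

```python
import json
from pathlib import Path

def patch_hook_command(settings_path: Path, command: str) -> bool:
    """Register a UserPromptSubmit hook command in settings.json.
    Returns True if the file was changed, False if already registered."""
    settings = (
        json.loads(settings_path.read_text(encoding="utf-8"))
        if settings_path.is_file() else {}
    )
    entries = settings.setdefault("hooks", {}).setdefault("UserPromptSubmit", [])
    for entry in entries:
        for hook in entry.get("hooks", []):
            # Substring check (no extension) matches both .ps1 and .py variants.
            if "peer-route" in hook.get("command", ""):
                return False  # already registered; idempotent no-op
    entries.append({"matcher": "", "hooks": [{"type": "command", "command": command}]})
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return True
```

Parsing and re-serializing (instead of text replacement) preserves whatever other hooks and settings the user already has in the file.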
Lib installation (Piece 6 + Piece 7):
3. Copy canonical ~/.claude/skills/peer/scripts/lib/peer_lib/*.py to runtime ~/.codex/tools/lib/peer_lib/*.py (create directory; same content-hash + backup pattern as U2). Files: __init__.py, common.py, sidecar.py, piece6_check.py, status_schema_check.py, status_update.py, process.py, settings_patcher.py (8 modules total). The 4 ask-*.py + 2 standalone (peer-status.py, peer-doctor.py) scripts insert <scripts-dir>/lib onto sys.path and import peer_lib, so co-locating the package alongside scripts is required.
Install timestamp (Piece 6 backward-compat):
4. Write ~/.claude/peer/install-timestamp.txt with current ISO 8601 UTC timestamp on FIRST install (do not overwrite on subsequent runs — the timestamp anchors the artifact-age backward-compat rule for sidecarless proposals). Check for existence and create only if missing.
U4 manifest update (v2 mac-port): the manifest at ~/.claude/peer/installed-files.txt now lists 21 files: 6 wrappers + 4 ask-*.py + 2 standalone (peer-status.py, peer-doctor.py) + 1 hook (peer-route.py) + 8 lib (peer_lib/{__init__,common,sidecar,piece6_check,status_schema_check,status_update,process,settings_patcher}.py).
~/.claude/peer/install-timestamp.txt and ~/.claude/settings.json are NOT manifest-tracked — they're live-checked at uninstall time (timestamp file removable; settings.json entry surgically removed via JSON-aware filter that drops entries containing the substring peer-route — matches both peer-route.ps1 legacy and peer-route.py v2).
Source-file encoding (v2 mac-port)
The runtime is now Python (.py); Python 3 reads source as UTF-8 by default and tolerates non-ASCII content (em-dash, smart quotes, ellipsis, etc.) without the PowerShell-5 ANSI-fallback parser breakage that motivated the legacy .ps1 ASCII-only rule. No source-file character restriction is enforced. The legacy .ps1 ASCII rule is retained for historical context only — see MIGRATION-v2-MAC.md.
Pieces 4, 6, 7 — script behavior summary (v1.6, v2-ported)
The 4 ask-{claude,codex}-{review,worker}.py scripts import from peer_lib and provide:
- Piece 4 (Research trail): scripts append a fixed "## Research trail" output-format directive to every prompt before invoking the underlying CLI. Peer responses begin with a Research trail block listing files read, commands run, sub-agent contributions (R-1/R-2/R-3 for review or A/B/C/D for worker; 7 total roles across action types). Token-count self-reporting omitted.
- Piece 6 (Round-gate enforcement): review/worker scripts gain --worker-proposal-path, --acknowledge-abort, --acknowledge-abort-confirm parameters. Before invoking the CLI, scripts call peer_lib.piece6_check.invoke_piece6_check which: validates round-number sanity, checks cross-phase REVISE/ESCALATE state, validates worker proposal coverage via the sidecar mechanism, refuses to fire on mid-round REVISE without proper APPLIED/REJECTED/ACKNOWLEDGED markers in the new prompt. Override path requires both flags + reason ≥ 60 chars passing denylist + positive-evidence regex + slug-confirm + hard-cap of 1 use per task slug.
- Piece 6 (Sidecar lifecycle): worker scripts write <peer>W-task.meta.json atomically alongside the proposal at submission. Review scripts auto-inject proposal contents (with prompt-injection-hardened wrapper preamble + delimited content + reassertion) and update sidecars on response success (TOCTOU defense: validate current SHA still matches the SHA at injection time before writing reviewed_in).
- Piece 7 (Status auto-update): scripts append a round-log entry to the task's status file on response success. Header fields stay caller-owned; round log becomes append-only by the script. Eliminates the "forgot to update status file" class of bugs.
- Piece 9 (Status-file schema gate): scripts call peer_lib.status_schema_check.invoke_status_file_schema_check immediately after the Piece 6 round-gate. If the task's status file exists but does not match the canonical schema (top heading # Status: <task-id>, all 6 required **Field:** paragraphs blank-line-separated in canonical order, ## Active artifacts, ## Round log, ESCALATE-conditional ## What I need from you), the script refuses to invoke the peer with a [Status Schema BLOCK] error and exit 2. Tolerant of BOM, CRLF/LF, trailing whitespace, multiple blank lines. If the status file does not exist, the existing [STATUS-FILE REMINDER] banner handles it; the validator does not fire. Closes the "AI improvises a non-canonical status file structure" failure mode that motivated v1.6.3. The bootstrap installs .agent-work/status/.template.md (per P1 substep below) so callers can cp .template.md <task>.md instead of generating from memory.
Project-local phase (steps P1–P3)
Runs only after the announcement+y/n confirmation passes (or skipped entirely for display commands).
P1 — Create the per-project folder layout
From the project root, create:
.agent-work/
claude-prompts/ claude-responses/ claude-workers/
codex-prompts/ codex-responses/ codex-workers/
notes/ results/ runs/ status/ templates/
Add .agent-work/ to .gitignore if not already present.
Status-file starter template (Piece 9, v1.6.3+): install/refresh .agent-work/status/.template.md from the canonical content embedded below using a content-hash + backup pattern (same convention as U2/U5 lib refresh). On each /peer invocation: if the file is absent, write the canonical content. If present and identical, no-op. If present and different, back up to .template.md.bak.<YYYY-MM-DD-HHmmss> and overwrite. This keeps every project's template current as the canonical schema evolves, while preserving any user-customized version as a backup. Callers create new status files via cp .agent-work/status/.template.md .agent-work/status/<task-id>.md and edit, instead of generating from memory (which is the failure mode Piece 9 closes).
Canonical .template.md content (write verbatim — this is the schema the validator enforces):
# Status: <task-id>
**Description:** <one-line task description>
**Status:** in-progress
**Phase:** plan
**Round:** plan round 1 of 3
**Last verdict:** (none yet)
**Codex tokens (cumulative):** 0
---
## Active artifacts
(Direct file links the user can right-click open. Update as new artifacts appear.)
- (populate as prompt/response files appear)
---
## Round log
(Round entries appended automatically by the peer-call scripts after each successful response. Caller may also add manual notes between rounds.)
Worktrees are NOT under .agent-work/. Peer worktrees live at <home>/.peer-worktrees/<peer>/<project-id>/<branch-name>/. The worker scripts create those folders on first run. They live outside the main repo so the Claude worker's deny rule for Write($mainPath/**) doesn't block Claude from editing its own sandbox.
If the project needs additional bash tools BEYOND the default allowlist (e.g., bun, deno, zig, stack, custom internal CLIs), the user can create .agent-work/peer-allow-bash.txt with one Claude permission rule per line. The Claude worker reads this file and appends its lines to the allowlist. The default allowlist already covers Node/JS/TS, Python, Rust, Go, Ruby, .NET, Java, make, and common linters; only add to the allow extension when the default doesn't cover what the project needs.
P2 — Auto-populate .agent-work/peer-deny-bash.txt (only if absent)
Skill mode does NOT write rules into the project's CLAUDE.md or AGENTS.md — the rules live in this SKILL.md, single source of truth. But the project's own CLAUDE.md / AGENTS.md may still mention paid-pipeline / deploy commands worth blocking. Scan them and write .agent-work/peer-deny-bash.txt with the detected patterns. Only write if the file does not already exist — the user may have hand-edited it to add patterns the auto-detector cannot find.
Scan scope: CLAUDE.md and AGENTS.md only.
Extraction contexts (a candidate command must appear in at least one):
- Inline backticks — `cmd ...`
- Fenced code blocks — triple-backtick blocks
- Same-line or immediately-next-line command following one of these literal phrases (case-insensitive): do not run, don't run, production deploy, do not invoke, paid pipeline, never run
Pattern set (literal command head; tail args ignored): wrangler, npx wrangler, pnpm run deploy, pnpm deploy, npm run deploy, npm deploy, pnpm run pipeline, pnpm pipeline, npm run pipeline.
Detection collection. Each detection is a tuple (normalized_pattern, source_file, line_number, trimmed_context):
normalized_pattern — Bash(<command-head>:*) form
source_file — basename: CLAUDE.md or AGENTS.md
line_number — 1-based line in source file
trimmed_context — collapse internal whitespace to a single space; strip leading/trailing whitespace; truncate to 80 characters max (no ellipsis); no locale-dependent transforms.
Dedupe + sort + emit:
- Group detections by normalized_pattern.
- Within each group, sort by source_file (CLAUDE.md before AGENTS.md; lexicographic if a third file is ever added), then by line_number ascending.
- Emit one full-line comment per detection in sort order: # detected in <source_file>: <trimmed_context>. Then emit the single rule line Bash(<cmd>:*), then a blank line separator before the next group.
- Emit the groups themselves in alphabetic order of normalized_pattern.
- Comments are ALWAYS full-line and immediately ABOVE the rule, never inline: the loader at runtime only strips lines starting with #, so an inline comment would be appended into the deny entry and break matching.
Empty case (no patterns detected): write the file with header + placeholder so the loader skip is benign:
# Auto-generated by /peer skill. Project-specific deny patterns appended below this header.
# Add project-specific deny patterns here, one per line, e.g. Bash(wrangler:*)
Idempotency guarantee: identical CLAUDE.md/AGENTS.md inputs always produce a byte-identical output file. On subsequent invocations, this step is a no-op.
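The dedupe + sort + emit stage above can be sketched as follows. The scanning stage that collects detection tuples is omitted; tuples are assumed already gathered per the extraction contexts above. Function and variable names are illustrative, not taken from the actual scripts.

```python
from collections import defaultdict

SOURCE_ORDER = {"CLAUDE.md": 0, "AGENTS.md": 1}  # CLAUDE.md sorts before AGENTS.md

def trim_context(raw: str) -> str:
    # Collapse internal whitespace to one space, strip ends,
    # truncate to 80 chars with no ellipsis.
    return " ".join(raw.split())[:80]

def emit_deny_file(detections) -> str:
    """detections: iterable of (normalized_pattern, source_file, line_number, trimmed_context)."""
    groups = defaultdict(list)
    for pattern, src, lineno, ctx in detections:
        groups[pattern].append((src, lineno, ctx))
    out = []
    for pattern in sorted(groups):  # groups in alphabetic order of normalized_pattern
        hits = sorted(groups[pattern],
                      key=lambda h: (SOURCE_ORDER.get(h[0], 2), h[0], h[1]))
        for src, lineno, ctx in hits:
            out.append(f"# detected in {src}: {ctx}")  # full-line comment ABOVE the rule
        out.append(pattern)  # the single rule line, e.g. Bash(wrangler:*)
        out.append("")       # blank-line separator between groups
    return "\n".join(out)
```

Because grouping, sorting, and truncation are all deterministic, identical inputs produce byte-identical output, which is what the idempotency guarantee requires.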
P3 — Smoke test (skip by default)
Run smoke tests ONLY if the user's invocation message, lowercased, contains any of these EXACT substrings: smoke test, with smoke test, run smoke test, test it, verify install. If none appear, skip and report smoke tests skipped (no trigger phrase in invocation).
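The trigger check above reduces to a plain substring scan over the lowercased invocation, sketched here for clarity (two of the listed phrases are subsumed by "smoke test" but are kept to mirror the spec literally):

```python
TRIGGERS = ("smoke test", "with smoke test", "run smoke test",
            "test it", "verify install")  # exact substrings per the spec

def smoke_requested(invocation: str) -> bool:
    # Case-insensitive: lowercase once, then check each exact substring.
    msg = invocation.lower()
    return any(t in msg for t in TRIGGERS)
```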
When triggered, run a one-round smoke in the available direction. From the project root:
Claude reviewer smoke (works only if running from a Codex chat):
printf "Reply with 'claude reviewer reachable' and end with VERDICT AGREE.\n" > ".agent-work/claude-prompts/CP-smoke.md"
py -3 "~/.codex/tools/ask-claude-review.py" --input-file ".agent-work/claude-prompts/CP-smoke.md"
Codex reviewer smoke (works only if running from a Claude chat):
printf "Reply with 'codex reviewer reachable' and end with VERDICT AGREE.\n" > ".agent-work/codex-prompts/XP-smoke.md"
py -3 "~/.codex/tools/ask-codex-review.py" --input-file ".agent-work/codex-prompts/XP-smoke.md"
Confirm the output ends with ## VERDICT\nAGREE. Skip worker smokes unless the project's git repo is in a state where a throwaway worktree is safe.
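A confirmation check of the kind described above can be sketched like this (the function name is illustrative; the two-line verdict block is the format this skill specifies):

```python
def ends_with_agree(response_text: str) -> bool:
    # After trimming trailing whitespace, the last two lines must be the
    # verdict header and AGREE, i.e. the response ends "## VERDICT\nAGREE".
    lines = response_text.rstrip().splitlines()
    return lines[-2:] == ["## VERDICT", "AGREE"]
```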
Report (always — runs after whichever phases ran)
Tell the user:
- User-global phase actions: which scripts/wrappers were created vs kept vs refreshed (including backup paths for any runtime-script overwrites).
- Project-local phase actions (if it ran): folders created, .agent-work/peer-deny-bash.txt state.
- Manifest path: ~/.claude/peer/installed-files.txt (with line count).
- Smoke test results, OR smoke tests skipped (no trigger phrase in invocation).
- Confirm: /peer skill is now active for this chat — workflow rules apply for the rest of the conversation.
REFERENCE (may be lost after compaction)
The remaining sections are full reference material. Compaction may drop them; the operating protocol above is what must survive. If you find yourself uncertain about a fanout detail, Windows pitfall, or known issue and the section below is gone, re-load this skill.
Multi-agent fanout
See protocol.md §Multi-agent fanout roles for shared role definitions (R-1/R-2/R-3 reviewer angles, A/B/C/D worker angles, caller-side parallel work, user override, over-fan-out signal).
Claude-specific implementation:
- Claude worker spawns sub-agents via the Task tool (allowed in --allowedTools).
- Codex worker spawns sub-agents via internal collab: SpawnAgent operations, enabled by features.multi_agent = true in ~/.codex/config.toml. No CLI flag is needed — the capability is model-level, not invocation-level.
Codex framework warning to ignore: during fanout, Codex sometimes prints ERROR codex_core::tools::router: error=Full-history forked agents inherit the parent agent type, model, and reasoning effort; omit agent_type, model, and reasoning_effort, or spawn without a full-history fork. This is a Codex internal config-shape complaint, not a fanout failure. To confirm fanout actually fired despite the warning, look for collab: SpawnAgent lines in the script's stdout. If those lines appear, the sub-agents ran. See Known issues / O-8.
Loop-prevention still applies to caller-side fanout. A caller-side sub-agent that itself makes a peer call goes through the script, which sets PEER_CALL_DEPTH. The deepest-level peer cannot spawn another peer call. Standard nesting limit holds.
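The depth guard described above can be sketched as follows. The environment-variable name comes from this skill; the limit value and function name are assumptions for illustration, since the exact nesting cap is defined by the scripts themselves:

```python
import os
import sys

MAX_DEPTH = 1  # assumed cap: the deepest-level peer may not spawn another peer call

def check_and_bump_depth(env=os.environ):
    """Refuse nested peer calls past the limit; otherwise record one more level."""
    depth = int(env.get("PEER_CALL_DEPTH", "0"))
    if depth >= MAX_DEPTH:
        sys.exit(f"peer-call refused: nesting limit reached (PEER_CALL_DEPTH={depth})")
    # The child CLI inherits the incremented value through its environment.
    env["PEER_CALL_DEPTH"] = str(depth + 1)
```

Because every peer call goes through a script that runs this check, a caller-side sub-agent gets exactly one level of peer calls and no more.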
Common Windows pitfalls
These are operational gotchas that have come up in practice. None block the workflow, but knowing about them avoids confusion when reading script output.
- Prompts must be UTF-8. All four scripts read prompt/response files as UTF-8. If you write prompts via tools that default to the system code page (Windows-1252 on most US installs), em dashes, smart quotes, and other non-ASCII characters mojibake into ??? when piped to the peer CLI. Stick to UTF-8-encoded prompt files.
- [Console]::OutputEncoding=... error in Codex output is harmless. Codex CLI shells out to PowerShell internally on Windows and prepends a [Console]::OutputEncoding=[System.Text.Encoding]::UTF8; line to set up UTF-8 output. Under PowerShell Constrained Language mode the property assignment is blocked and emits PropertySetterNotSupportedInConstrainedLanguage. The error is cosmetic — Codex's actual response is unaffected. Ignore.
- Trailing failed to record rollout items: thread <id> not found is harmless. Codex CLI sometimes emits this ERROR codex_core::session: ... line at the end of a run. It refers to Codex's internal session-recording layer and appears after the response is already produced. Not a workflow failure.
- PowerShell console renders ? for em-dash and curly quotes; the underlying file is fine. When you Get-Content a UTF-8 response file in a default PowerShell console, characters like —, ', ', ", " come out as ? because the host's output encoding is the system code page. The file bytes on disk are still valid UTF-8 — verify with py -3 -c "print(open(r'<path>','rb').read())" if uncertain. Do NOT chase this as an encoding bug from console rendering alone; reproduce from raw file content first.
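The disk-versus-console distinction above can be checked programmatically. This is a rough heuristic, not one of the installed scripts: it looks for the UTF-8 byte sequences of the common curly characters and for literal ? runs written to disk.

```python
def curly_chars_intact(path: str) -> bool:
    """Heuristic: True if curly chars survive on disk as real UTF-8 bytes."""
    data = open(path, "rb").read()
    # U+2014 em dash and U+2019 right single quote as UTF-8 byte sequences.
    has_utf8_curly = b"\xe2\x80\x94" in data or b"\xe2\x80\x99" in data
    # Literal ??? written into the file bytes is real mojibake,
    # not a console-rendering artifact.
    mojibaked = b"???" in data
    return has_utf8_curly and not mojibaked
```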
Known issues (open, do NOT auto-fix without triage)
These are real issues observed during workflow self-tests where the right fix is not yet clear. Listed here so they don't disappear into chat history; investigate when convenient, but do NOT apply blind fixes — each one may have surprising root causes that a quick patch would mask.
- O-1 — Claude Code's Write tool silently normalizes U+2019 (') to ASCII ' on file writes. Em-dashes (—, U+2014) and curly double quotes survive untouched; only the curly apostrophe is affected. Workaround when an exact curly apostrophe is required in a prompt: use Python open(<path>, 'w', encoding='utf-8') with the literal ' to write the file directly, or patch the file post-hoc the same way. Do not assume Write preserves it.
- O-7 — Intermittent UTF-8 mojibake in Codex reviewer responses. During a comprehensive workflow self-test, 1 of 4 reviewer calls produced a response file whose curly chars were all replaced with literal ? in the file bytes — not just rendered that way in the console, but actually written to disk that way. The other 3 of 4 responses (same script, same prompt structure, same UTF-8 prompt inputs) preserved everything cleanly. Causation is unproven. Recommend an isolated repro before assuming a fix.
- O-8 — Codex multi-agent fork prints a config-shape warning, fanout fires anyway. Worker stdout includes ERROR codex_core::tools::router: error=Full-history forked agents inherit the parent agent type, model, and reasoning effort; omit agent_type, model, and reasoning_effort, or spawn without a full-history fork. immediately before the collab: SpawnAgent lines. The fanout completed and produced the expected A/B/C/D synthesis in the proposal's Provenance section. Verification rule: if you see this warning, scan stdout for collab: SpawnAgent lines to confirm fanout fired before assuming single-agent fallback. If SpawnAgent lines are absent, fanout silently degraded.
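The O-8 verification rule reduces to scanning worker stdout for the two marker strings. A sketch, with an illustrative function name and return values:

```python
def fanout_status(stdout: str) -> str:
    """Classify a Codex worker run per the O-8 verification rule."""
    warned = "codex_core::tools::router" in stdout   # config-shape warning
    spawned = "collab: SpawnAgent" in stdout          # proof sub-agents ran
    if spawned:
        return "fanout-ok (warning is cosmetic)" if warned else "fanout-ok"
    # No SpawnAgent lines: fanout silently degraded to a single agent.
    return "fanout-degraded"
```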
Future hardening
- The peer-deny-bash.txt auto-detector could also scan package.json scripts (deploy, pipeline, etc.) for additional deny patterns. Not implemented; Parker's locked scope is CLAUDE.md / AGENTS.md only.
- The Claude worker has a much richer per-tool deny list than the Codex worker because the Claude CLI exposes --allowedTools / --disallowedTools natively. Codex relies on --sandbox + system-prompt rules. If Codex CLI later exposes finer-grained tool denies, update scripts/ask-codex-worker.py accordingly.
- The MUST-call trigger list is intentionally short. Every additional trigger increases token cost. Tune by removing triggers that fire on tasks where peer review consistently adds no value.
- The 3-round cap is a backstop, not a target. Most calls should resolve in 1-2 rounds. If round 3 is hitting often, the prompts are unclear or the task is too big for one peer call — split it.