auto-approve-architecture

Internals of Leap's Claude auto-approve flow and the CLI state machine (CLIStateTracker): the PermissionRequest hook, the AskUserQuestion exclusion, per-session auto_send_mode isolation and pin-file robustness, up/down arrow handling during dialogs and slash-command pickers, the on-input no-reset rule, and the TUI-menu fallback. Use this when modifying auto-approve behavior, the state tracker, hook handling in leap-hook-process.py, or Claude permission/dialog detection.

Stars: 43

Forks: 2

Updated: June 8, 2026 at 15:24

Source

Nevo24

Nevo24/leap

SKILL.md

Auto-Approve Architecture (Claude)

ALWAYS-mode auto-approve has two layers — the primary is hook-based and never renders a dialog; the fallback is the legacy TUI-menu path that types "1\r" into a rendered prompt.

Primary: PermissionRequest hook (hook_script auto_approve). Configured in ClaudeProvider.configure_hooks with matcher ^(?!AskUserQuestion$).* (every tool EXCEPT AskUserQuestion; see "AskUserQuestion exclusion" below). The hook script handler _handle_auto_approve() in leap-hook-process.py reads the session's auto_send_mode from .storage/pinned_sessions.json[tag] (with global fallback to .storage/settings.json). In ALWAYS mode it emits {"hookSpecificOutput": {"hookEventName": "PermissionRequest", "decision": {"behavior": "allow"}}} to stdout and then calls sys.exit(0). SystemExit inherits from BaseException (not Exception), so it propagates past the except Exception block and the trailing print('{}') in __main__ never runs, leaving exactly one JSON object on stdout rather than a second object appended after the decision. In PAUSE mode the handler returns normally, so the trailing print('{}') runs and tells Claude "no decision", and the dialog renders normally.
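A minimal sketch of the single-JSON-object guarantee described above. The names handle_auto_approve, hook_main and run_hook, the mode argument, and the lowercase mode strings are illustrative assumptions, not the actual contents of leap-hook-process.py; only the emitted payload and the sys.exit(0) / except Exception interplay come from the text.

```python
import contextlib
import io
import json
import sys


def handle_auto_approve(mode: str) -> None:
    """Emit an allow decision in ALWAYS mode; return normally otherwise."""
    if mode != "always":
        return  # PAUSE: no decision, the caller prints '{}'
    decision = {
        "hookSpecificOutput": {
            "hookEventName": "PermissionRequest",
            "decision": {"behavior": "allow"},
        }
    }
    print(json.dumps(decision))
    sys.exit(0)  # raises SystemExit, which is a BaseException


def hook_main(mode: str) -> None:
    """Hypothetical stand-in for the script's __main__ block."""
    try:
        handle_auto_approve(mode)
    except Exception:
        pass  # does NOT catch SystemExit
    print("{}")  # reached only when no decision was emitted


def run_hook(mode: str) -> str:
    """Run hook_main and return everything it wrote to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        try:
            hook_main(mode)
        except SystemExit:
            pass  # simulate the process exiting
    return buf.getvalue()
```

With this shape, run_hook("always") yields one line holding the allow decision, and run_hook("pause") yields only the empty object.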

AskUserQuestion exclusion. AskUserQuestion is the one tool whose entire purpose is to elicit a user choice. If PermissionRequest returns "allow" for it, Claude interprets that as "skip user interaction" — the question dialog is never rendered and the tool returns an empty answer set to the model ("Allowed by PermissionRequest hook" with no selections), corrupting the very flow the user invoked it for. The negative-lookahead matcher excludes the exact tool name AskUserQuestion so its PermissionRequest goes unanswered, Claude renders the dialog, and the user actually picks. Pinned by test_claude_permission_request_matcher_excludes_ask_user_question.
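The negative-lookahead matcher can be checked in isolation. This sketch uses Python's re module purely for illustration; Claude evaluates the matcher with its own regex engine, which is assumed here to treat this pattern the same way. The helper name hook_applies is hypothetical.

```python
import re

# The matcher string from the hook configuration described above.
MATCHER = re.compile(r"^(?!AskUserQuestion$).*")


def hook_applies(tool_name: str) -> bool:
    """Return True when the PermissionRequest hook would match this tool."""
    return MATCHER.match(tool_name) is not None
```

Only the exact name is excluded: a tool whose name merely starts with AskUserQuestion still matches, because the lookahead requires the end of the string right after the name.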

Per-session isolation of auto_send_mode. The hook fallback (per-tag pin → global → 'pause') is the protocol the hook script uses, but in steady state the global fallback is almost never hit — four guarantees in the server + monitor keep per-tag pins authoritative:

Snapshot at LeapServer.__init__ (server.py:134-145). Resolves global → per-tag pin and writes the result back into the pin BEFORE CLIStateTracker initialises, so later changes to the Settings dialog's global default can't retroactively flip this session's hook behavior.
set_auto_send_mode handler writes ONLY the per-tag pin — never the global (server.py:582-602). The original cross-session leak ("I toggled one session and all of them auto-approved") came from a stray save_settings(...) in this handler; per-session toggles must stay per-session. Pinned structurally by test_save_settings_not_imported_by_server and behaviourally by test_handler_does_not_modify_settings_file in test_auto_send_mode_persistence.py.
SessionMixin._merge_sessions preserves auto_send_mode in pin_data by pulling it from the live server's status response (s.get('auto_send_mode')) — without this, the monitor's first auto-pin write for a brand-new session would build pin_data from a stale in-memory cache that lacked the field, blowing away the server's snapshot. Pinned by test_merge_sessions_auto_send.py.
All monitor-side pin writes are per-tag (no save_pinned_sessions(self._pinned_sessions) full-state saves anywhere). Three targeted helpers in monitor/pr_tracking/config.py (update_pinned_session_field(tag, field, value), write_pinned_session_entry(tag, entry), remove_pinned_session_tag(tag)) each do a read-modify-write that touches only the requested tag's entry. The write_pinned_session_entry helper additionally treats auto_send_mode as server-owned: disk's value always wins over the caller's (possibly stale) in-memory copy, so a monitor refresh that ships an old auto_send_mode in pin_data can't clobber a fresh server-side toggle. Pinned by test_set_auto_send_mode_isolation.py and test_pinned_sessions_corruption.py. Residual narrow race: if a different writer mutates a different tag between this helper's read and write (~5–10ms window), that writer's change is lost; full elimination would need fcntl.flock, which is not currently implemented. All three helpers (and the server's symmetric _save_pinned_auto_send_mode) also have a corrupt-disk recovery path: a malformed pin file is treated as empty on read so the next write produces a valid JSON file again, restoring the self-healing behavior the pre-fix save_pinned_sessions had.
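The per-tag read-modify-write with a server-owned auto_send_mode might look roughly like the sketch below. The explicit path parameter, the _read_pins helper, and the choice to keep the caller's value when disk has none are assumptions made for illustration; the real helpers in monitor/pr_tracking/config.py may differ.

```python
import json
from pathlib import Path


def _read_pins(path: Path) -> dict:
    """Read the pin file; treat a missing or malformed file as empty."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def write_pinned_session_entry(path: Path, tag: str, entry: dict) -> None:
    """Replace one tag's entry while keeping the on-disk auto_send_mode."""
    pins = _read_pins(path)
    new_entry = dict(entry)
    old = pins.get(tag)
    if isinstance(old, dict) and "auto_send_mode" in old:
        # Server-owned field: the value on disk wins over the caller's copy.
        new_entry["auto_send_mode"] = old["auto_send_mode"]
    pins[tag] = new_entry  # other tags are carried over untouched
    path.write_text(json.dumps(pins, indent=2), encoding="utf-8")
```

The sketch is not atomic, so it has the same narrow read-to-write race the text describes.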

Pin-file readers on the __init__ critical path (_load_pinned_auto_send_mode, validate_pinned_session, build_auth_fetch_url, load_settings, load_pinned_sessions) catch (OSError, ValueError) rather than (OSError, JSONDecodeError) — UnicodeDecodeError is a ValueError and the narrower except let a malformed pin file crash session startup. They also isinstance-guard non-dict roots / entries and validate auto_send_mode is one of (PAUSE, ALWAYS) before propagating into CLIStateTracker. Tested in test_pinned_sessions_corruption.py.
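A tolerant reader along these lines illustrates why (OSError, ValueError) is the right except clause and how the per-tag pin → global → 'pause' fallback composes with validation. The function names, the lowercase mode strings, and the assumption that settings.json stores the global default under an auto_send_mode key are hypothetical.

```python
import json
from pathlib import Path

VALID_MODES = ("pause", "always")


def _load_json_dict(path: Path) -> dict:
    """Return the file's JSON object, or {} on any read or parse problem."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # ValueError covers both JSONDecodeError and UnicodeDecodeError.
        return {}
    return data if isinstance(data, dict) else {}  # guard non-dict roots


def resolve_auto_send_mode(pins_path: Path, settings_path: Path, tag: str) -> str:
    """Resolve per-tag pin, then global default, then 'pause'."""
    entry = _load_json_dict(pins_path).get(tag)
    candidates = [
        entry.get("auto_send_mode") if isinstance(entry, dict) else None,
        _load_json_dict(settings_path).get("auto_send_mode"),
    ]
    for mode in candidates:
        if mode in VALID_MODES:  # reject unknown or malformed values
            return mode
    return "pause"
```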

↑/↓ during mid-RUNNING dialogs and slash-command pickers. Two distinct classes of in-CLI UI keep the state at RUNNING while ↑/↓ need to reach the CLI:

AskUserQuestion's question dialog fires no Notification hook (it's a built-in tool, not an MCP elicitation), so state stays RUNNING until the 5 s cursor+silence fallback flips it to NEEDS_PERMISSION.
Slash-command pickers (/resume, /mcp, /agents, /config, /effort, /model, /memory, /login, /doctor, /usage, /bug, /permissions, …) fire no hook at all and leave state in RUNNING for the entire time the picker is open.

In both cases the server's input filter (server.py:3506) would normally see ↑/↓ as RUNNING-state arrows and steal them for history recall, leaving the user unable to navigate the picker. The fix is a screen check: the input filter calls CLIStateTracker.screen_has_active_dialog() and passes ↑/↓ through whenever it returns True. Two complementary predicates make up the check:

provider.is_dialog_certain(tail_compact) — strict permission-dialog footer (Entertoselect + Esctocancel in the compact form of the last 5 non-blank rows) or numbered-menu cursor (❯1.). Kept strict because the same predicate gates state transitions where false positives stick state in NEEDS_PERMISSION for 60 s.
not provider.is_idle_prompt_visible(filled_rows) — structural detection of Claude's standard idle input box: a ─ HR row immediately followed by a ❯ input row, within the last _IDLE_TAIL_WINDOW non-blank rows (HR rows must be ≥_MIN_HR_LEN (40) chars and contain only ─/whitespace, so inline ── widgets like the /effort slider axis — which carries a ▲ — are rejected). A closing bottom ─ HR is present on some Claude builds and absent on others (the footer sits directly under the input row, with no second rule), so it is not required — only the top-HR→❯ pairing. When that pairing is gone from the bottom of the screen something is taking it over — a slash-command picker, the trust dialog, a permission dialog that didn't match the strict footer — and ↑/↓ belong to that something, not to history recall. Intentionally structural so new Claude pickers added next month work without us enumerating their footer text. Falls back to True (assume idle visible) when the screen has fewer than _IDLE_DETECT_MIN_ROWS non-blank rows, so transient / boot-time screens preserve the legacy strict-dialog-only behaviour.
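The structural idle-box check can be sketched as follows. MIN_HR_LEN (40) comes from the text; the values chosen for the tail window and the minimum row count are assumptions, as is the simplification that the input row starts with ❯ after leading whitespace.

```python
HR_CHAR = "─"
MIN_HR_LEN = 40           # from the text (_MIN_HR_LEN)
IDLE_TAIL_WINDOW = 8      # assumed value for _IDLE_TAIL_WINDOW
IDLE_DETECT_MIN_ROWS = 3  # assumed value for _IDLE_DETECT_MIN_ROWS


def is_hr_row(row: str) -> bool:
    """A long rule made only of the box-drawing dash and whitespace."""
    stripped = row.strip()
    return len(stripped) >= MIN_HR_LEN and set(stripped) <= {HR_CHAR, " "}


def is_idle_prompt_visible(rows: list[str]) -> bool:
    """True when a top rule is immediately followed by a prompt row."""
    filled = [r for r in rows if r.strip()]
    if len(filled) < IDLE_DETECT_MIN_ROWS:
        return True  # sparse screen: assume the idle box is visible
    tail = filled[-IDLE_TAIL_WINDOW:]
    for upper, lower in zip(tail, tail[1:]):
        if is_hr_row(upper) and lower.lstrip().startswith("❯"):
            return True
    return False
```

A closing bottom rule is deliberately not required, so both the two-rule and the single-rule box shapes pass, while a short inline rule such as a slider axis does not count as the box's top edge.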

Non-Claude providers (Codex / Cursor / Gemini) inherit is_idle_prompt_visible defaulting to True, so that leg is a no-op for them — but see the generic detector next, which is what makes their dialogs work.

Generic selection-dialog detection (all providers). screen_has_active_dialog() calls provider.screen_shows_selection_dialog(filled) first, before the dialog_patterns short-circuit — a CLI-agnostic detector (base CLIProvider) that fires on a numbered ›/❯/▶ selection cursor (› 1.) or a footer line carrying confirm/cancel/navigate hints (esc to cancel / enter to confirm / ↑/↓ to navigate) that looks like a footer — ≥2 distinct hints, a footer separator (·/•), or a short (≤40-char) hint-only line — rather than a long prose sentence that merely quotes one hint phrase. The footer check is cursor-glyph independent (it does NOT require a ›/❯), so it catches pickers whose selection marker is something else (Gemini/Cursor, future CLIs). A bare ›/❯ is still insufficient (it appears in idle prompts — Codex's ghost-text hint, Claude's ❯), which is why the cursor leg requires a numbered option. This is what lets ↑/↓ navigate Codex dialogs (empty dialog_patterns → the old short-circuit returned False and arrows were stolen for history recall — the "stuck arrows in a Codex multi-option dialog" report) and hardens Gemini/Cursor non-permission pickers, without each provider enumerating its footers. Because screen_has_active_dialog() is consumed only by the ↑/↓ input filter, a false positive is cheap (the arrow just reaches the CLI's native handling). Claude's is_dialog_certain + structural is_idle_prompt_visible path still runs after, unchanged. Pinned by TestScreenHasActiveDialog in test_state_tracker.py (idle-visible / picker shapes / dialog scrolled-out cases, plus test_codex_selection_dialog_detected_despite_empty_patterns and test_codex_idle_prompt_is_not_a_dialog) and by TestClaudeProvider::test_claude_idle_prompt_* in test_provider_behaviors.py (sandwich + single-HR-box detection, single-rule-in-prose rejection, picker shapes, short-inline-rule rejection, picker-focused-row rejection).
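A rough sketch of the CLI-agnostic detector described above: a numbered selection cursor, or a hint-bearing line that looks like a footer. The hint list, the reading of "hint-only line" as "short line containing a hint", and the case-insensitive matching are assumptions; the base CLIProvider implementation may be stricter.

```python
import re

NUMBERED_CURSOR = re.compile(r"[›❯▶]\s*\d+\.")
HINTS = ("esc to cancel", "enter to confirm", "to navigate")
SEPARATORS = ("·", "•")
SHORT_LINE = 40  # a short hint line counts as a footer


def screen_shows_selection_dialog(rows: list[str]) -> bool:
    """Detect a selection dialog without provider-specific footers."""
    for row in rows:
        if NUMBERED_CURSOR.search(row):
            return True  # cursor leg: needs a numbered option
        text = row.strip().lower()
        hits = sum(hint in text for hint in HINTS)
        if hits == 0:
            continue
        looks_like_footer = (
            hits >= 2
            or any(sep in text for sep in SEPARATORS)
            or len(text) <= SHORT_LINE
        )
        if looks_like_footer:
            return True  # footer leg: independent of the cursor glyph
    return False
```

A bare › prompt row and a long prose sentence that quotes a single hint both fall through to False, which matches the false-positive cases the text calls out.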

The running→idle cursor+silence flip must also respect is_idle_prompt_visible (don't blank an interactive UI out from under the arrow gate). The screen_has_active_dialog() arrow gate above only protects ↑/↓ while the UI is still in pyte's buffer. The running→idle cursor+silence fallback in get_state (~5 s of output silence with the cursor visible) would _reset_screen() on its way to IDLE, and it only side-stepped that for a dialog whose footer matched the strict is_dialog_certain form (which promotes to NEEDS_PERMISSION instead). That left a gap — confirmed live against real claude 2.1.162 with Leap's real tracker wired to the PTY: any RUNNING-state interactive UI whose footer is not Enter to select + Esc to cancel fell through after ~5 s of user deliberation to RUNNING → IDLE + _reset_screen(). Real cases that miss the strict footer: every slash-command picker (/model → Enter to set as default · s to use this session only · Esc to cancel; /resume → … Esc to cancel; both lack Enter to select, and with the ❯ cursor on a later option the ❯1. numbered-menu fallback misses too), plus alternate/older Claude dialog footers that genuinely exist in the binary (Esc to close on the tabbed multi-question / /agents view, Enter to approve … Esc to cancel, the multi-select Space to toggle, Enter to confirm …). Once the screen was blanked, screen_has_active_dialog() read "no dialog", so ↑/↓ were stolen for history recall (the "arrows get stuck in a picker / multi-choice question after a few seconds" report), and the false IDLE let the auto-sender flush a queued message straight into the open UI. 
The fix: the cursor+silence running→idle block stays RUNNING (no reset) when both not provider.is_idle_prompt_visible(filled) and (provider.has_selection_cursor(filled) or provider.has_interactive_footer(filled)), placed after the transcript_says_running() / interrupt / post-answer-grace guards, so a genuinely silent in-flight tool is still held by the transcript guard and only the final would-be idle flip is intercepted. The box-absent signal alone is too broad — plain response text (a numbered list, a long body ending in > ) also lacks the idle box yet must still idle, so a not is_idle_prompt_visible-only guard wrongly held it RUNNING (caught by test_dialog_false_positives). A real picker/dialog additionally shows either a ❯/› selection cursor on a focused option (has_selection_cursor scans the last _IDLE_TAIL_WINDOW rows) or, for cursor-less UIs like the /agents tabbed view, a nav/dismiss footer on the bottom row (has_interactive_footer matches distinctive markers — to navigate / Esc to close / Esc to cancel / Space to toggle / Enter to confirm — on the last non-blank row only, so prose mentioning Enter to select mid-sentence with a > prompt last doesn't match); plain response text has neither. (When the idle box IS present, is_idle_prompt_visible is True so the guard never reaches the cursor check — the genuinely-idle prompt's own ❯ input row can't trip it. This is exactly why box detection must cover single-HR builds: an earlier two-HR-sandwich requirement returned False on builds that render only a top rule, so after a no-Stop-hook idle — e.g. the /cost slash command, which fires no Stop hook so the state falls to this cursor+silence path — the guard reached the cursor check, matched the prompt's own ❯, and wedged the session in RUNNING forever. The detector now requires only the top-HR→❯ pairing.) 
Naturally scoped: other providers' is_idle_prompt_visible defaults True (and has_selection_cursor / has_interactive_footer default False) → no-op; the < _IDLE_DETECT_MIN_ROWS short-screen shortcut still returns "idle visible" so /clear-style sparse screens idle exactly as before. This generalizes protection beyond the strict footer — AskUserQuestion itself happens to render Esc to cancel in current Claude (so it promotes to NEEDS_PERMISSION and never needed this guard), but the guard covers every non-matching footer (pickers, Esc to close, future pickers) without enumerating them. Note the cursor-hidden edge: a picker that hides its cursor skips the cursor-gated 5 s block and falls to the 60 s safety net (rare for input UIs — Claude pickers keep the cursor visible — and the 60 s net is deliberately left ungated so the hung-silent-tool escape hatch still fires). Pinned by TestInteractiveUiKeepsRunningOnSilence in test_state_tracker.py (picker stays running, arrows stay navigable, genuine idle box still idles, plain text without a cursor idles, footer-only dialog without a cursor stays running) and by TestDialogFalsePositives (numbered list / scrolled-out phrases idle, not held RUNNING).
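The combined guard can be sketched as a pure function over the visible rows. The marker list follows the text; the window size, the case-insensitive comparison, and the function name should_stay_running are assumptions made for this illustration.

```python
FOOTER_MARKERS = (
    "to navigate",
    "esc to close",
    "esc to cancel",
    "space to toggle",
    "enter to confirm",
)
TAIL_WINDOW = 8  # assumed value for _IDLE_TAIL_WINDOW


def _filled(rows: list[str]) -> list[str]:
    return [r for r in rows if r.strip()]


def has_selection_cursor(rows: list[str]) -> bool:
    """A selection glyph on a focused row within the tail window."""
    tail = _filled(rows)[-TAIL_WINDOW:]
    return any(r.lstrip().startswith(("❯", "›")) for r in tail)


def has_interactive_footer(rows: list[str]) -> bool:
    """A nav/dismiss marker on the last non-blank row only."""
    filled = _filled(rows)
    if not filled:
        return False
    last = filled[-1].lower()
    return any(marker in last for marker in FOOTER_MARKERS)


def should_stay_running(rows: list[str], idle_box_visible: bool) -> bool:
    """Hold RUNNING only when the box is gone AND a picker signal shows."""
    if idle_box_visible:
        return False  # the idle prompt's own cursor must not trip the guard
    return has_selection_cursor(rows) or has_interactive_footer(rows)
```

Plain response text that ends in an ASCII > prompt has neither signal, so it still idles even though the idle box is absent.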

Answering a dialog must NOT _reset_screen() (on_input, gated on from_prompt). A multi-question AskUserQuestion renders as one tabbed dialog; answering one question (Enter from a PROMPT state) advances to the next question via an Ink incremental repaint that never re-emits the (unchanged) footer. If on_input resets the pyte screen on that answer, the footer is wiped and — for the ~5 s until Claude's next full re-render — the live screen has no dialog footer. That single desync drove two bugs, both confirmed against a live session log: (1) the cursor+silence check in get_state reads "no dialog" (is_dialog_certain False) and flips RUNNING → idle, falsely marking the still-pending question as done and letting the auto-sender dispatch a queued message INTO the dialog; (2) the ↑/↓ input filter's screen_has_active_dialog() likewise reads "no dialog" and steals the arrows for history recall (the "arrows dead on the 2nd question, but typing the number works" report). The fix: on_input skips _reset_screen() when the Enter answers a PROMPT state (needs_permission/needs_input) — keeping pyte truthful so the footer survives the incremental repaint and both the promotion (→ needs_permission) and the arrow check stay correct. This mirrors the running→needs_permission promotion path, which already skips the reset for the same reason. IDLE (a fresh prompt) and INTERRUPTED (an interrupt reply) still reset — there the prior screen is stale scrollback with nothing rendered incrementally on top. Pinned by TestDialogAnswerKeepsScreen in test_state_tracker.py (footer-preserved-so-navigable, no-false-idle, and the IDLE/INTERRUPTED still-reset guards).

Holding a hookless dialog at NEEDS_PERMISSION (no Permission↔Idle oscillation). Because AskUserQuestion writes no permission signal, the rendered footer on screen is the only reliable evidence it's still pending — a first-action question is never even written to the transcript while it waits. Two rules keep the promoted state from flickering back to Idle: (1) the cursor+silence running→needs_permission promotion does not _reset_screen() (matching the _handle_idle_output proactive promotion) — resetting desyncs pyte from Ink, which then only partially repaints and never restores the footer, so the waiting→idle dismissal checks would falsely read "dialog gone" and demote; (2) the 60 s stuck-waiting safety timeout keeps the waiting state while has_dialog_indicator still matches the live screen, scoped to PROMPT_STATES so a stuck INTERRUPTED still recovers. Pinned by test_incremental_repaint_after_promotion_keeps_dialog (faithful Ink-style incremental-repaint repro: demotes to idle pre-fix, holds post-fix) plus the test_*_dialog*on_screen cases in TestSafetyTimeouts.

Post-answer resume grace (no false idle into the first-token gap). Answering a mid-turn dialog (Enter from a PROMPT state) moves NEEDS_PERMISSION/NEEDS_INPUT → RUNNING, but Claude then resumes the same turn — AskUserQuestion is excluded from hook auto-approve, so in ALWAYS mode it's the one dialog answered by hand, and its answer is mid-turn, not end-of-turn. The model's first post-answer token can lag several seconds, while the dialog-dismissal render emits a tiny output burst within ~40 ms of the Enter. That burst moves _last_output_time past _running_since, opening the max(_last_output_time, _running_since) rebase gate, so the 5 s cursor+silence running→idle fallback then misfires on the first-token silence and the auto-sender flushes a queued message INTO the live turn (confirmed in a real state_logs capture: Enter…→running → 5.1 s silence → running→idle (cursor visible + output silent 5.1s) → ON_SEND 4 ms later → idle→needs_permission as Claude kept working). All three would-be guards are blind to this exact shape: the running-indicator only matches Compactingconversation; transcript_says_running() returns '' because the only assistant entry is the dialog's tool_use at ts <= _running_since (the answer bumped _running_since past it, tripping the if ts <= since: return '' guard in _classify_transcript_tail); and is_dialog_certain is False once the footer is cleared. The fix: on_input sets _awaiting_resume_after_prompt = from_prompt, and the shared helper _post_answer_grace_holds(silence_ref) gates both heuristic idle paths while that flag is set — capped at the safety-silence timeout (provider.silence_timeout or SAFETY_SILENCE_TIMEOUT=60 s), not an unconditional return, since these blocks run before the 60 s safety net and an early return would starve it (a genuinely hung post-answer turn must still recover). 
The two paths: (1) running→idle cursor+silence stays RUNNING; (2) waiting→idle cursor+silence — reachable because the running→idle block can re-promote RUNNING→NEEDS_PERMISSION off the still-on-screen answered footer (its grace check sits after that promotion), landing here when the footer finally clears — routes to RUNNING instead of idling (it must route, not just suppress: the waiting→idle signal path needs _user_responded, which the answer cleared, so staying NEEDS_PERMISSION could strand a real Stop-hook idle). The flag is cleared on every IDLE (top of get_state) and on on_send. The grace is free for all four providers because each writes an idle signal on turn end (Claude/Codex/Cursor Stop, Gemini AfterAgent) and the signal=idle path idles without consulting the flag — so a real end still idles promptly; only the unreliable heuristics are suppressed. Codex never reaches either branch (cursor_hidden_while_idle=True). INTERRUPTED never arms the flag (from_prompt is PROMPT_STATES-only), so interrupt-reply handling is untouched. Pinned by test_enter_from_waiting_stays_running_through_first_token_gap, test_post_answer_grace_still_idles_via_safety_timeout, test_on_send_clears_post_answer_grace, test_post_answer_grace_clears_on_idle, test_post_answer_stale_footer_repromote_routes_to_running (the waiting→idle secondary path), and test_waiting_self_dismiss_still_idles_without_an_answer (no-regression: flag-unset self-dismiss still idles) in TestSafetyTimeouts.

Composing guard (no false idle while the user types a prompt). The heuristic idle paths also misfire when the user composes the next prompt into a still-RUNNING session (type-ahead during the model's thinking / first-token latency): the cursor is visible, the typed keystrokes aren't echoed while the CLI is busy (so they don't bump _last_output_time), and a pause flips running→idle — a false "finished" notification, and the auto-sender could dispatch a queued message INTO the half-typed prompt. Confirmed in real state_logs: ON_INPUT state=CLIState.RUNNING data=b"f 'NotesCmdContext' " len=768 followed ~2 s later by running→idle (cursor visible + output silent). The fix: get_state(pty_alive, has_pending_input=False) takes a composing flag — the server passes bool(self._terminal_input_buf) or self._queue_capture_mode (unsubmitted text in the input box, or a ^^ queued message being composed) at both call sites. When set, the cursor+silence running→idle flip and the safety-silence timeout stay RUNNING (placed after the transcript / interrupt / post-answer-grace guards, mirroring the grace above, so interrupts and needs_permission promotions still fire). The authoritative paths are not gated — the signal=idle hook path and Codex transcript-completion still idle a genuinely-finished turn even while the user types — so no session can get stuck RUNNING; for hookless idles the gate releases the instant the box empties (_terminal_input_buf clears on Enter/Ctrl+C, shrinks on backspace/Ctrl+U). Because the gate only suppresses the heuristic fallbacks, a stale buffer can at worst delay an idle by one poll, never wedge it. is_ready (the auto-sender's readiness convenience) forwards the same flag so dispatch is gated identically — and the production auto-sender is already gated because it consumes the gated current_state via is_ready_for_state. 
Other consumers see it too: the monitor's idle notification reads the gated cli_state from the status response, and Slack's output_capture.on_state_change uses the gated current_state. Pinned by test_cursor_silence_idle_held_while_composing, test_safety_timeout_idle_held_while_composing, test_hook_idle_not_gated_by_composing, and test_is_ready_false_while_composing in TestSafetyTimeouts.
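The ordering that keeps the composing guard from wedging a session can be reduced to a small decision function. This is a simplified sketch, not the real get_state: the function name, the string states, and the 5 s threshold parameter are assumptions, and the transcript, interrupt, and post-answer guards are omitted.

```python
def next_state_on_silence(
    signal_idle: bool,
    cursor_visible: bool,
    silence_s: float,
    has_pending_input: bool,
    threshold_s: float = 5.0,
) -> str:
    """Decide between 'running' and 'idle' for a RUNNING session."""
    if signal_idle:
        return "idle"  # authoritative hook path: never gated by composing
    if has_pending_input:
        return "running"  # user is typing: suppress the heuristic flip
    if cursor_visible and silence_s >= threshold_s:
        return "idle"  # cursor+silence heuristic fallback
    return "running"
```

Since the authoritative signal is checked first, a stale input buffer can only delay a heuristic idle; it cannot block a real end-of-turn.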

Heuristic-hold caps (a screen-misread can't wedge RUNNING forever) + the labeled idle-box border. The two RUNNING-holds that are screen-content heuristics — the post-answer resume grace and the picker/dialog interactive-UI guard — can mis-read the screen with no reliable user-recoverable release, so each is capped at the safety-silence timeout via the shared _heuristic_hold_cap() (provider.silence_timeout or SAFETY_SILENCE_TIMEOUT); past the cap they fall through to the idle fallback, so an is_idle_prompt_visible false-negative can no longer wedge RUNNING. The composing guard above is deliberately NOT capped — it's released the instant the input box empties (submit / Ctrl+C / clear), so it's a user-recoverable "still typing" hold, not a wedge, and capping it would re-fire the false "finished" notification it exists to prevent (this is why both composing guards — cursor+silence and safety-silence — stay bare if has_pending_input). The most common box false-negative is also fixed at the source: Claude can draw a short text badge into the idle box's top rule (e.g. ────psakdin-case-law-source────, an active-skill / model / plan-mode chip), which the strict pure-─ _is_prompt_box_hr rejected → box undetected → the interactive-UI guard matched the prompt's own ❯ and wedged (confirmed live: nushi.log held RUNNING ~58 min). _is_prompt_box_hr now accepts a rule carrying a short label, gated by an allowlist (_is_hr_label_safe: letters/digits/-_./() only), so a rule embedding graphics — table borders, block-element progress bars ████░░░░, sliders, geometric shapes, percent bars — is still rejected; the failure mode is asymmetric (a mis-classified border falls back to the cap, never a false-idle). Pinned by test_interactive_ui_guard_is_capped_and_recovers and test_claude_idle_prompt_visible_with_label_in_border.
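The labeled-border acceptance can be sketched as an allowlist check on whatever sits between the rule characters. The allowlist characters come from the text; restricting letters to ASCII, the maximum label length, and the function names without leading underscores are assumptions.

```python
import re

HR_CHAR = "─"
MIN_HR_LEN = 40     # from the text (_MIN_HR_LEN)
MAX_LABEL_LEN = 40  # assumed bound for a "short" label
_LABEL_SAFE = re.compile(r"[A-Za-z0-9\-_./()]+")


def is_hr_label_safe(label: str) -> bool:
    """Letters, digits and -_./() only; graphics are rejected."""
    return len(label) <= MAX_LABEL_LEN and _LABEL_SAFE.fullmatch(label) is not None


def is_prompt_box_hr(row: str) -> bool:
    """Accept a pure rule, or a rule carrying one short safe label."""
    stripped = row.strip()
    if len(stripped) < MIN_HR_LEN:
        return False
    if set(stripped) <= {HR_CHAR, " "}:
        return True  # pure rule
    if not (stripped.startswith(HR_CHAR) and stripped.endswith(HR_CHAR)):
        return False
    label = stripped.strip(HR_CHAR).strip()
    return is_hr_label_safe(label)
```

A rule that embeds block elements, a slider marker, or a percent sign fails the allowlist and is treated as not a box border, which at worst falls back to the hold cap.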

Auto-sender dispatch safety (server.py _auto_sender_loop). Beyond gating on the composing-aware current_state, two direct guards protect the dispatch: (1) it never sends while the input buffer is non-empty (_terminal_input_buf / _queue_capture_mode), independent of state, so a half-typed prompt is never clobbered even past a hold cap; (2) a 2-consecutive-ready-poll debounce, so a single-poll false-idle can't flush a queued message mid-turn. A ^^ + Enter force-dispatch (_capture_force_dispatch) bypasses both.
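The two dispatch guards and the force bypass can be sketched as a small gate object. The class name DispatchGate, its method signature, and the choice to reset the streak on pending input and after a dispatch are assumptions; only the buffer check, the 2-poll debounce, and the force bypass come from the text.

```python
class DispatchGate:
    """Hypothetical reduction of the auto-sender's dispatch guards."""

    REQUIRED_READY_POLLS = 2

    def __init__(self) -> None:
        self._ready_streak = 0

    def should_dispatch(
        self, is_ready: bool, input_pending: bool, force: bool = False
    ) -> bool:
        if force:
            self._ready_streak = 0
            return True  # force-dispatch bypasses both guards
        if input_pending or not is_ready:
            self._ready_streak = 0
            return False  # never clobber a half-typed prompt
        self._ready_streak += 1
        if self._ready_streak >= self.REQUIRED_READY_POLLS:
            self._ready_streak = 0
            return True
        return False  # first ready poll: wait for confirmation
```

A single ready poll followed by a not-ready poll resets the streak, so a one-poll false idle never flushes the queue.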

Codex INTERRUPTED recovery (safety-waiting-timeout). The timeout's signal_state == current keep is scoped to PROMPT_STATES. INTERRUPTED writes its own interrupted signal, so an unscoped keep blocked the demotion forever for cursor_hidden_while_idle providers (Codex has no cursor+silence INTERRUPTED self-dismissal) — the "interrupt sticks in INTERRUPTED" failure mode. A stale INTERRUPTED now recovers after the timeout. Pinned by test_codex_interrupted_recovers_via_safety_timeout.

Critically: the auto_approve state does NOT touch the signal file. It's a pure hook decision; Leap's state machine stays RUNNING throughout, as if no permission had ever been needed.

This hook fires for subagent (Task tool) tool calls too, which the older Notification path could silently miss — Claude's Stop hook does not fire for subagents, so an entire multi-agent turn stayed RUNNING with _last_running_snapshot == [], and the Late Notification guard had no fallback content to verify the dialog against. The PermissionRequest hook sidesteps every TUI race because no dialog is ever rendered.

Fallback: TUI menu auto-approve (_try_auto_approve in server.py). Still wired up for two scenarios:

Older Claude versions that don't support PermissionRequest — the new hook entry is silently ignored by them, and approval falls back to detecting ❯ 1. Yes on the rendered menu and typing 1\r.
Defense-in-depth race — if PermissionRequest somehow doesn't fire (e.g. a future Claude bug, or an unrecognized matcher edge case), Notification(permission_prompt) still fires, the state tracker transitions to NEEDS_PERMISSION, and _try_auto_approve picks up the dialog.

The _try_auto_approve path itself was strengthened: the Late Notification guard at state_tracker.py:get_state formerly rejected RUNNING→prompt signals when no dialog patterns were on screen AND _last_running_snapshot was empty — that's exactly the multi-agent subagent shape. The guard now distinguishes the post-Enter stale signal (empty screen + empty snapshot, the freshly-answered-via-Enter signature) from a fresh subagent signal (screen has accumulated subagent output, snapshot empty because no idle transition during the turn). Only the empty-and-empty pair is treated as stale; anything else lets the signal through.
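The revised staleness rule reduces to a single conjunction. This sketch assumes the guard sees the screen as a list of rows and the snapshot as a list; the function name and parameters are hypothetical, and the real check in get_state has more context than shown here.

```python
def is_stale_prompt_signal(
    screen_rows: list[str],
    last_running_snapshot: list[str],
    dialog_on_screen: bool,
) -> bool:
    """Treat a RUNNING→prompt signal as stale only for empty-and-empty."""
    if dialog_on_screen:
        return False  # a visible dialog always validates the signal
    screen_empty = not any(row.strip() for row in screen_rows)
    snapshot_empty = not last_running_snapshot
    return screen_empty and snapshot_empty
```

A subagent turn has accumulated output on screen with an empty snapshot, so its signal is let through, while the freshly answered post-Enter shape (both empty) is still rejected.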

What auto-approve does NOT auto-handle. MCP Elicitation (Notification matcher elicitation_dialog) is not auto-approved — these are free-form input requests where Leap can't guess what to type. They surface to the user via NEEDS_INPUT. Permission-to-USE the elicitation tool is auto-approved (it's a tool call), but the resulting question dialog stays user-facing — that's the right asymmetry.

Other CLIs (Codex, Cursor, Gemini). The bug above is Claude-specific because Claude is the only one with subagents. Codex/Cursor have no permission hook at all (state tracker uses TUI detection); Gemini uses Notification(ToolPermission) but has no subagent concept. None of them get a PermissionRequest hook — the test test_other_providers_do_not_install_permission_request pins this.

name	auto-approve-architecture
description	Internals of Leap's Claude auto-approve flow and the CLI state machine (CLIStateTracker): the PermissionRequest hook, the AskUserQuestion exclusion, per-session auto_send_mode isolation and pin-file robustness, up/down arrow handling during dialogs and slash-command pickers, the on-input no-reset rule, and the TUI-menu fallback. Use this when modifying auto-approve behavior, the state tracker, hook handling in leap-hook-process.py, or Claude permission/dialog detection.
user-invocable	false
