Automated UI verification loop: Marionette screenshot -> Figma reference -> pixel-diff -> difference list -> auto-fix iteration. Solves the false-positive verification problem flagged in insights reports (UI claimed fixed without actually being compared against the reference). Triggers (English): verify ui, ui compare, pixel diff, figma compare, automated ui verification, golden test fail. Trigger keywords (Chinese): verify ui, ui 对比, pixel diff, figma 对比, 自动验证
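The pixel-diff and difference-list steps of this loop can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes both images are already decoded into rows of RGB tuples, and the names `pixel_diff`, `verdict`, `tolerance`, and `max_diff_ratio` are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels (assumed already decoded)


def pixel_diff(actual: Image, reference: Image, tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return (x, y) coordinates where any channel differs by more than tolerance."""
    same_height = len(actual) == len(reference)
    if not same_height or any(len(a) != len(r) for a, r in zip(actual, reference)):
        raise ValueError("screenshot and reference must have the same dimensions")
    diffs = []
    for y, (row_a, row_r) in enumerate(zip(actual, reference)):
        for x, (pa, pr) in enumerate(zip(row_a, row_r)):
            if any(abs(ca - cr) > tolerance for ca, cr in zip(pa, pr)):
                diffs.append((x, y))
    return diffs


def verdict(actual: Image, reference: Image, tolerance: int = 0,
            max_diff_ratio: float = 0.0) -> Dict[str, object]:
    """Summarise the comparison so a claimed fix is checked against the reference."""
    diffs = pixel_diff(actual, reference, tolerance)
    total = sum(len(row) for row in reference)
    ratio = len(diffs) / total if total else 0.0
    return {"passed": ratio <= max_diff_ratio, "diff_ratio": ratio, "diffs": diffs}
```

An auto-fix iteration would call `verdict` after each change and stop only when `passed` is true, rather than trusting a claim that the UI was fixed.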
Structured visual QA verdict for screenshot-to-reference comparisons
Operational rules for driving Codex CLI from scripts: success-signal contract, diff-feeding semantics, worktree -C flag, and stdin vs argv. Triggers when invoking `codex exec` programmatically (not interactively) — script wrappers, ralph loops, cron pipelines, multi-CLI fan-out. Surfaces silent failure modes that exit 0 but produce no useful output. Triggers (English): codex exec, codex cli, codex review, codex rescue, codex fallthrough, agent script invocation, programmatic codex.
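One way to guard against the "exits 0 but produces no useful output" failure mode is a wrapper that treats empty output as an error. The sketch below is a generic illustration under stated assumptions: `run_checked` and `SilentFailure` are hypothetical names, and the command line is supplied by the caller, so no particular `codex exec` arguments are assumed here.

```python
import subprocess
from typing import Optional, Sequence


class SilentFailure(RuntimeError):
    """Raised when a command exits 0 but prints nothing useful."""


def run_checked(cmd: Sequence[str], stdin_text: Optional[str] = None,
                timeout: float = 600.0) -> str:
    """Run cmd, optionally feeding stdin_text, and require non-empty stdout."""
    proc = subprocess.run(
        list(cmd),
        input=stdin_text,      # None means the child gets no piped input
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"exit {proc.returncode}: {proc.stderr.strip()}")
    output = proc.stdout.strip()
    if not output:
        # Exit code alone is not a success signal; empty output counts as failure.
        raise SilentFailure("exit 0 but stdout was empty")
    return output
```

A script wrapper, loop, or cron pipeline would pass its own argument list as `cmd` and decide per call whether the prompt travels via `stdin_text` or as an argv element.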
Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup
Socratic deep interview with mathematical ambiguity gating before autonomous execution
Use when creating or developing, before writing code or implementation plans - refines rough ideas into fully-formed designs through collaborative questioning, alternative exploration, and incremental validation. Don't use during clear 'mechanical' processes
Use when partner provides a complete implementation plan to execute in controlled batches with review checkpoints - loads plan, reviews critically, executes tasks in batches, reports for review between batches
Use when completing tasks, implementing major features, or before merging to verify work meets requirements - dispatches code-reviewer subagent to review implementation against plan or requirements before proceeding