| name | muggle-do |
| description | Unified Muggle AI workflow entry point. Use when user types muggle do or asks for autonomous implementation to PR. Also handles the `address-reviews` directive (dispatched by the muggle-pr-followup watcher when new submitted reviews land on a PR). |
Muggle Test Do
Telemetry first step: see ../_shared/telemetry-emit.md. Use skillName: "muggle-do".
Runs an autonomous dev cycle from requirements to PR. Fire and review: user answers one pre-flight questionnaire, then walks away. The muggle-pr-followup watcher invokes /muggle-do again with the address-reviews directive when new reviews land.
Forward pipeline (fresh feature)
Stage 7 dispatches one watcher per opened PR as its last action.
Execution protocol (non-negotiable)
The pipeline table lists pointers, not summaries. Open each stage's file and execute from it — running a stage off its one-line row here is how tests, E2E, and session state get silently skipped. If you have not read a stage's file this run, you have not run that stage.
Bootstrap before any code, in order:
- Emit telemetry: ../_shared/telemetry-emit.md, skillName: "muggle-do".
- Create ~/.muggle-ai/muggle-do/sessions/<slug>/ with state.md + iterations/001.md (pre-flight owns this; do it even when running unattended).
- TodoWrite one item per stage 1–8. These stages are the checklist; never swap in your own decomposition.
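The session bootstrap step can be sketched as follows. This is a minimal illustration, not the pre-flight implementation: the function name `bootstrap_session` and the `root` parameter are hypothetical, and the files are created empty because their real contents are defined by the schema files, which are not reproduced here.

```python
from pathlib import Path
from typing import Optional


def bootstrap_session(slug: str, root: Optional[Path] = None) -> Path:
    """Create <root>/<slug>/ with state.md and iterations/001.md if missing.

    `root` defaults to ~/.muggle-ai/muggle-do/sessions; it is a parameter
    only so the sketch can be exercised against a temporary directory.
    """
    base = root if root is not None else (
        Path.home() / ".muggle-ai" / "muggle-do" / "sessions"
    )
    session = base / slug
    (session / "iterations").mkdir(parents=True, exist_ok=True)

    for relative in ("state.md", "iterations/001.md"):
        path = session / relative
        if not path.exists():
            # Placeholder only: the real contents follow the session schemas.
            path.write_text("", encoding="utf-8")
    return session
```

The function is idempotent, so calling it again on a resumed run leaves existing session files untouched.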
Per stage: read the file → execute it → append a marker to iterations/<NNN>.md citing the evidence that file requires (jest exit code, E2E verdict + runId, screenshot path). A stage is done only when its evidence is written, never on recollection.
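As a rough sketch of "done only when its evidence is written", the helper below refuses to record a stage without evidence and appends one marker line to the iteration file. The name `append_stage_marker` and the marker line format are assumptions for illustration; each stage file defines the evidence it actually requires.

```python
from pathlib import Path


def append_stage_marker(iteration_file: Path, stage: int,
                        evidence: dict[str, str]) -> str:
    """Append a one-line completion marker for a stage (hypothetical format).

    Raises ValueError when no evidence is supplied, so a stage cannot be
    marked done on recollection alone.
    """
    if not evidence:
        raise ValueError(f"stage {stage}: no evidence supplied")

    fields = ", ".join(f"{key}={value}" for key, value in evidence.items())
    line = f"- stage {stage} done: {fields}\n"

    # Append mode keeps earlier stage markers intact.
    with iteration_file.open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

For example, `append_stage_marker(path, 5, {"jest_exit_code": "0"})` appends the line `- stage 5 done: jest_exit_code=0`.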
"Autonomous" / "without my intervention" collapses exactly one thing
Best-effort the Stage-1 questionnaire and don't ask. It does not license skipping telemetry, session artifacts, requirements, unit tests, E2E (autoE2ETest defaults to always), browser verification, the gate below, or the watcher hand-off. Run the whole pipeline silently — never a shortcut.
Definition of Done — gate before Stage 7
Do not create or update a PR until each line holds, or is waived by a one-line reason written into state.md (silence is not a waiver):
- requirements.md written (forward runs)
- Build clean: typecheck + lint on changed files
- New/changed logic carries unit tests (authored in Stage 3; Stage 5 only runs the suite)
- Unit suite run, PASS recorded
- E2E verdict recorded with runId per autoE2ETest, or [E2E FAILING] / SKIPPED + reason
- UI changes verified in a real browser with evidence (screenshot path or muggle runId); curl + grep is not verification
Opening a PR with an unchecked, unwaived line is a cycle failure.
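The gate logic can be sketched as below: a line passes if it holds or carries a non-empty waiver reason, and anything else blocks the PR. The `GateLine` type and both function names are hypothetical; how the checklist is actually read from state.md is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GateLine:
    name: str
    holds: bool
    # One-line reason as written into state.md; None or blank means no waiver.
    waiver: Optional[str] = None


def blocking_lines(lines: List[GateLine]) -> List[str]:
    """Return the names of lines that neither hold nor are waived."""
    return [
        line.name
        for line in lines
        if not line.holds and not (line.waiver and line.waiver.strip())
    ]


def may_open_pr(lines: List[GateLine]) -> bool:
    """Stage 7 may proceed only when nothing blocks."""
    return not blocking_lines(lines)
```

A blank waiver string is treated the same as no waiver, matching the rule that silence is not a waiver.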
Address-reviews flow
When invoked with the directive (PR URL + slug + review ids), routes to ../do/address-reviews.md. Shares stages 3–6 + walkthrough with the forward pipeline; skips pre-flight, requirements, and PR creation. See the orchestrator for the cycle's exact step order, classification rules, and respawn logic.
Input routing
/muggle-do serves one interactive mode (the forward pipeline, from a fresh task) and four programmatic modes the watcher dispatches (address-reviews, fix-ci, resolve-conflicts, post-merge cleanup). Resolve $ARGUMENTS to a mode per ../do/input-routing.md before doing anything else.
Preferences
| Preference | Gate |
|---|---|
| autoE2ETest | Stage 6: run E2E every cycle (default always), or fold into pre-flight |
| autoResolveConflicts | On rebase conflict: resolve autonomously behind a verify-or-rollback gate (opt-in), or abort + escalate (default never) |
| autoRouteBuildToMuggleDo | Front-door guardrail: route build/implement/fix requests through this pipeline (build delegated to superpowers); fired by the UserPromptSubmit guardrail, default ask |
autoUseWorktree, autoRebase, autoResolveConflicts, autoCreatePR, autoCleanup fire from per-stage files.
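The three defaults stated in the table can be captured in a small lookup, sketched here under assumptions: the function name `resolve_preference` and the `overrides` mapping are hypothetical, and only the defaults this document states are encoded. Preferences whose defaults live in per-stage files raise instead of guessing.

```python
from typing import Dict, Optional

# Defaults exactly as stated in the preferences table above.
DOCUMENTED_DEFAULTS: Dict[str, str] = {
    "autoE2ETest": "always",
    "autoResolveConflicts": "never",
    "autoRouteBuildToMuggleDo": "ask",
}


def resolve_preference(name: str,
                       overrides: Optional[Dict[str, str]] = None) -> str:
    """Return the user's override if present, else the documented default."""
    if overrides and name in overrides:
        return overrides[name]
    if name in DOCUMENTED_DEFAULTS:
        return DOCUMENTED_DEFAULTS[name]
    # Defaults for other preferences are defined in per-stage files.
    raise KeyError(f"no default documented here for {name!r}")
```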
Session model
Each session lives in ~/.muggle-ai/muggle-do/sessions/<slug>/. Schemas: ../muggle-pr-followup/state-schemas.md.
| File | Owner |
|---|---|
| state.md | Stage 1 or bootstrap |
| iterations/<NNN>.md | Every stage |
| requirements.md | Stage 2 (forward only) |
| prs.json, last_seen.json, followup.log | Stage 7 / watcher / /muggle-do |
| result.md | Stage 7 (seeded), terminal tick (finalized) |
Guardrails
- Stage 1 is the only user-facing forward stage. Stages 2–7 don't ask mid-cycle; a blocker that would need a question is a pre-flight bug.
- Same stage failing 3× → escalate.
- 3 cycle iterations reach E2E with failures → ship with [E2E FAILING].
- Address-reviews escalation (ambiguous or design-adjustment) does not block the watcher; user resolves on GitHub.
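The two numeric guardrails above can be expressed as a small decision function. This is a sketch under assumptions: the function name, the counter arguments, and the returned labels are hypothetical, and checking the stage-failure limit before the E2E limit is an ordering chosen for illustration, not one the guardrails specify.

```python
MAX_STAGE_FAILURES = 3          # same stage failing 3 times
MAX_E2E_FAILING_ITERATIONS = 3  # cycle iterations that reached E2E with failures


def next_action(stage_failures: int, e2e_failing_iterations: int) -> str:
    """Map the two failure counters to an action label (hypothetical labels)."""
    if stage_failures >= MAX_STAGE_FAILURES:
        return "escalate"
    if e2e_failing_iterations >= MAX_E2E_FAILING_ITERATIONS:
        return "ship [E2E FAILING]"
    return "continue"
```

Keeping both thresholds as named constants makes the limits easy to audit against the guardrail list.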