megaplan-observe
// Observe an in-flight megaplan — introspect state, trace events, diagnose blockages, detect drift. Companion to megaplan-decision. Use during and after a run, not before.
| name | megaplan-observe |
| description | Observe an in-flight megaplan — introspect state, trace events, diagnose blockages, detect drift. Companion to megaplan-decision. Use during and after a run, not before. |
When a megaplan is running — or stuck — megaplan introspect, megaplan trace, and megaplan doctor tell you what it's doing, why, since when, and how to intervene. This skill covers the observation surface; megaplan-decision covers profile/robustness/depth selection before a run.
megaplan introspect --plan X returns a single structured JSON payload. Four fields in it are the killers — each eliminates a failure mode that previously cost real sessions hours of confusion.
now_utc: the anti-stale-timestamp anchor
Every timestamp in the introspect payload is only meaningful relative to now_utc, the wall clock at the moment the payload was generated. Never infer recency from JSON timestamps without cross-checking against now_utc from the same payload. A last_artifact_at of 14:25:11Z might look recent, but if now_utc is 15:45:00Z, the artifact is 80 minutes old and the phase is likely stuck.
Rule: when reading introspect output, compute every duration as now_utc - timestamp, not as "when I last looked."
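The rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of megaplan: it assumes the payload carries ISO-8601 UTC strings, the helper names `parse_utc` and `age` are hypothetical, and the calendar date in the sample is invented (only the clock times come from the text).

```python
from datetime import datetime, timedelta


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as '2025-01-01T15:45:00Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat(), so normalise it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def age(payload: dict, timestamp: str) -> timedelta:
    """Age of `timestamp` measured against the payload's own now_utc."""
    return parse_utc(payload["now_utc"]) - parse_utc(timestamp)


# Hypothetical payload fragment; field layout is an assumption.
payload = {
    "now_utc": "2025-01-01T15:45:00Z",
    "active_phase": {"last_artifact_at": "2025-01-01T14:25:11Z"},
}

artifact_age = age(payload, payload["active_phase"]["last_artifact_at"])
print(artifact_age)  # 1:19:49
```

Reading both values from the same payload is the point: the result does not depend on when the reader happens to look at the JSON.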
active_phase.liveness: the go/no-go enum
One of four values; each dictates a different response:
| Liveness | Meaning | Action |
|---|---|---|
| progressing | Events <60s old OR in-flight LLM call exists | Wait. The system is working. |
| quiet | Last event 60-300s ago, no in-flight LLM | Watch. May be between checks; check again in 30s. |
| stalled | Last event >300s ago AND no unmatched llm_call_start | Intervene. The phase has stopped. |
| timeout-imminent | Phase age > 80% of phase_timeout | Decide now. Extend timeout, kill phase, or accept partial results. |
Critical rule: a phase with an unmatched llm_call_start (no matching llm_call_end yet) is NEVER classified as stalled, regardless of wall-clock age. The model is still producing — be patient.
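To make the table and the critical rule concrete, here is a rough Python sketch of the classification. It is not megaplan's implementation. The function name and parameters are hypothetical, and checking timeout-imminent before the other states is an assumption, since the table does not say which condition wins when two apply.

```python
def classify_liveness(last_event_age_s: float, inflight_llm: bool,
                      phase_age_s: float, phase_timeout_s: float) -> str:
    """Map the table's conditions onto the four liveness values."""
    # Assumption: timeout pressure takes precedence over everything else.
    if phase_age_s > 0.8 * phase_timeout_s:
        return "timeout-imminent"
    # Critical rule: an unmatched llm_call_start is never 'stalled'.
    if inflight_llm or last_event_age_s < 60:
        return "progressing"
    if last_event_age_s <= 300:
        return "quiet"
    return "stalled"


# Actions copied from the table.
ACTIONS = {
    "progressing": "wait",
    "quiet": "watch, check again in 30s",
    "stalled": "intervene",
    "timeout-imminent": "decide now",
}

# A 20-minute-old last event with an LLM call still in flight is not stalled.
state = classify_liveness(1200, True, 1300, 3600)
print(state, "->", ACTIONS[state])  # progressing -> wait
```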
block_details.recoverable_via: the only moves the state machine will accept
When state: blocked, this field enumerates the exact recovery actions the state machine will accept. Never try a recovery action that isn't in this list. The invalid_transition error exists precisely because callers guessed; recoverable_via replaces guessing with the canonical transition table.
The list is drawn from workflow_next() / infer_next_steps() — the single-source function in megaplan/_core/workflow.py that the override handler itself checks against. It is always consistent with what megaplan override will accept.
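A caller can enforce this discipline before touching megaplan override. The sketch below assumes the payload exposes a top-level `state` key and that `recoverable_via` entries are free-form strings like the ones quoted later in this document. The helper names are hypothetical and the substring match is an illustrative choice, not megaplan behaviour.

```python
def allowed_recoveries(payload: dict) -> list:
    """Return recoverable_via, or an empty list when the plan is not blocked."""
    if payload.get("state") != "blocked":
        return []
    return list((payload.get("block_details") or {}).get("recoverable_via", []))


def choose_recovery(payload: dict, wanted: str) -> str:
    """Return the listed option mentioning `wanted`, or refuse to guess."""
    options = allowed_recoveries(payload)
    for option in options:
        if wanted in option:
            return option
    raise ValueError(f"{wanted!r} is not in recoverable_via: {options}")


# Hypothetical payload fragment using the option strings from the worked example.
payload = {
    "state": "blocked",
    "block_details": {
        "recoverable_via": [
            "fix brief and re-init (recommended)",
            "override add-note + override force-proceed",
        ]
    },
}

print(choose_recovery(payload, "force-proceed"))
```

Raising instead of falling back to a default keeps the caller from sending a transition the state machine would reject.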
rubric_doc.drift: tooling/doc misalignment before it bites
If rubric_doc.drift.missing_locally is non-empty, the megaplan-decision skill references profile names your binary doesn't expose. This is the exact failure mode that wastes hours: the skill says --profile thoughtful, the binary says Unknown profile 'thoughtful'. drift catches it before the invocation. Use a profile from profiles_available_locally whose recipe matches what the rubric describes, or pin the binary to a state that has the canonical names.
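A pre-flight check along these lines can catch the mismatch before a plan is initialised. This is a sketch under assumptions: it places `profiles_available_locally` under `rubric_doc`, which the text does not confirm, and `check_profile` is a hypothetical helper. The profile names are taken from the worked example later in this document.

```python
def check_profile(payload: dict, profile: str) -> None:
    """Raise if `profile` is one the local binary does not expose."""
    rubric = payload.get("rubric_doc") or {}
    missing = (rubric.get("drift") or {}).get("missing_locally", [])
    # Assumption: the available list sits next to drift under rubric_doc.
    available = rubric.get("profiles_available_locally", [])
    if profile in missing or (available and profile not in available):
        raise ValueError(
            f"profile {profile!r} is not available locally; choose from {available}"
        )


# Hypothetical payload fragment.
payload = {
    "rubric_doc": {
        "drift": {"missing_locally": ["basic", "led", "thoughtful", "super-premium"]},
        "profiles_available_locally": ["solo", "directed", "partnered", "premium", "apex"],
    }
}

check_profile(payload, "partnered")  # passes silently
try:
    check_profile(payload, "thoughtful")
except ValueError as err:
    print(err)
```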
When something seems wrong, investigate in this order:
1. megaplan introspect --plan X: one call, full picture. Always start here. The four killer fields answer 90% of questions.
2. megaplan trace --plan X --follow: if introspect says progressing but you want to watch. Stream events live; narrative format gives prose summaries of LLM calls.
3. megaplan doctor --plan X: if introspect shows a flag or state you don't recognise. Diagnostic checks with remediation hints.
4. Read state.json, events.ndjson, and phase artifacts manually. If you need this, file a bug; introspect should cover it.

The hierarchy is deliberate: each step is a thin reader over the same events.ndjson journal. Jumping to the filesystem before trying the surfaces misses the structured analysis those surfaces provide (liveness computation, drift detection, recoverable_via enumeration).
Each entry: the introspect signature, the recovery, and a worked example from a real session.
Signature: 4 of N checks complete, active_phase.last_artifact_rel > 15min, subprocess still has open network socket, active_phase.liveness: quiet.
Recovery:
1. If last_artifact_rel < phase_timeout / 2: wait. The model may be producing a large check artifact.
2. If last_artifact_rel > phase_timeout / 2: check the LLM heartbeat via trace --follow --format narrative. If heartbeats stopped, the LLM call may be wedged; kill the phase and resume.
3. Check block_details.outstanding_flags: if the critique found flags it can't resolve, it may have looped without producing artifacts.

Worked example:
Session: prompt-registry-and-reminder-bundling-v5, critique phase. 4 of 5 checks completed, last check at 14:25:11Z. At 14:55:00Z, introspect shows liveness: quiet, last_artifact_rel: 29m 49s ago. Subprocess PID 58800 still has an open TCP socket to Fireworks. trace --follow --format narrative shows: "Token stream stopped 28m ago. 4,200 tokens emitted at 18 tok/s. Last token at 14:26:31Z — no tokens since." The model finished producing but hermes didn't close the call, likely a provider-side hang. Kill the phase; megaplan resume picks up from where critique left off.
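The first two recovery steps for this blockage reduce to a small decision function, sketched here in Python. The function name is hypothetical, the 50-minute phase_timeout is invented for illustration (the session above does not state it), and the "keep watching" branch is an assumption because the text only says what to do once heartbeats have stopped.

```python
def critique_stall_action(last_artifact_age_s: float, phase_timeout_s: float,
                          heartbeats_alive: bool) -> str:
    """Decide how to respond to a critique phase that has gone silent."""
    if last_artifact_age_s < phase_timeout_s / 2:
        # The model may still be producing a large check artifact.
        return "wait"
    if heartbeats_alive:
        # Assumption: a live heartbeat past the halfway mark means keep observing.
        return "keep watching"
    # Heartbeats stopped: the LLM call may be wedged.
    return "kill phase and resume"


# Numbers from the session above: the last artifact is 29m 49s old.
age_s = 29 * 60 + 49
print(critique_stall_action(age_s, phase_timeout_s=3000, heartbeats_alive=False))
```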
Signature: state: blocked, block_details.outstanding_flags non-empty, block_details.recoverable_via populated.
Recovery:
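1. Read recoverable_via: the list is the exact set of override actions the state machine will accept.
2. If the outstanding flags are acceptable, use override force-proceed with a note explaining why.
3. Never attempt an action outside the list; the state machine rejects it with invalid_transition.

Worked example: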
Session: Plan went blocked at gate with 2 outstanding flags (FLAG-V4-001: high severity invariant contradiction, FLAG-V4-002: medium severity missing coverage). recoverable_via shows: ["fix brief and re-init (recommended)", "override add-note + override force-proceed", "override replan (requires state ∈ {critiqued, failed, finalized, gated})"]. FLAG-V4-001 is a genuine correctness issue: fix the brief to resolve the contradiction, then re-init. FLAG-V4-002 is acceptable: the uncovered module is out of scope per the brief. Add a note documenting the scope decision, then override force-proceed.
Signature: rubric_doc.drift.missing_locally non-empty. The decision skill references profiles the binary doesn't have.
Recovery:
1. Use a profile from profiles_available_locally whose tier/recipe matches what the rubric describes.
2. Run megaplan doctor --repo before megaplan init to catch this proactively.

Worked example:
Session: megaplan init --profile thoughtful → Unknown profile 'thoughtful'. introspect (on a different plan) shows rubric_doc.drift.missing_locally: ["basic","led","thoughtful","super-premium"], profiles_available_locally: ["solo","directed","partnered","premium","apex"]. The binary is on branch sprint-a-base, which renamed the canonical profiles to the new 5-tier scheme. The skill doc hasn't been updated yet. Use --profile partnered (the tier-3 equivalent) or switch to main, where the old names still exist.
Four explicit, named rules derived from real failure modes. Violating any of these caused measurable confusion in prior sessions.
now_utc cross-check
state.json timestamps are snapshot values: they're only as recent as the last write. now_utc is the actual wall clock. Always compute recency against now_utc. A last_step.timestamp of 5 minutes ago might mean the phase is stuck, or it might mean state.json hasn't been flushed; now_utc tells you which.
invalid_transition: read recoverable_via first
The state machine rejects invalid transitions with a specific error. Retrying the same override without checking recoverable_via is the definition of a loop. recoverable_via is computed from the same transition table the handler enforces, so it is always correct.
If megaplan is installed via pip install -e ., any branch switch or uncommitted change in the source tree changes behavior immediately. megaplan doctor --repo surfaces this. Stashing, checking out, or pulling without explicit user consent can silently remove the profile a caller is about to invoke.
liveness
active_phase.liveness is the single source of truth for whether a phase is progressing. progressing means there's recent activity or an in-flight LLM call: wait. quiet means watch. Only stalled or timeout-imminent warrant intervention. Guessing based on wall-clock intuition leads to killing phases that were about to finish.
User asks whether a long-running plan is still progressing.
megaplan introspect --plan prompt-registry-and-reminder-bundling-v5
Read active_phase.liveness. If progressing: "Yes, critique is still running. Last artifact 3 minutes ago (critique_check_scope.json). Model claude:opus-4.7 is actively producing tokens." If quiet: "Critique hasn't produced an artifact in 8 minutes; the last event was a few minutes ago and no LLM call is in flight. Watching." If stalled: "Critique appears to have stopped: no events in 12 minutes and no in-flight LLM call. Check recoverable_via."
megaplan introspect --plan my-sprint
Read block_details.recoverable_via. Execute the first applicable option. Verify the state transition succeeded with another introspect call. If the override was force-proceed, confirm state is now past blocked. If the override was replan, confirm the plan re-entered the planning phase.
megaplan trace --plan my-sprint --format narrative --since 10m
The narrative format groups consecutive LLM calls into prose summaries. Look for which model is being called, how frequently, and at what token volume. Cross-reference against the tier's expected cost profile from megaplan-decision. If the model is correct but call frequency is high, the plan may be looping — check phase_retry events. If the model is wrong (e.g., premium model on a solo-tier phase), check for an unintended override set-profile.
megaplan doctor --repo
The repo-level check catches: rubric/binary drift (skill references profiles the current binary doesn't have), editable-install + dirty working tree (uncommitted changes affecting behavior), skill files out of sync with installed copies. Fix the specific WARN/ERROR lines before running any plan.
megaplan introspect --plan X
The full payload has the answer. Start at the top:
1. now_utc: establish the wall-clock anchor.
2. active_phase.liveness: is it progressing?
3. active_phase.last_artifact_rel: how long since the last artifact?
4. block_details: is it blocked? What flags? What recoveries?
5. rubric_doc.drift: is there a profile-name mismatch?
6. binary_git: is the binary on the expected branch? Dirty?
7. timeline: scan the phase-by-phase breakdown for anomalies.

If nothing jumps out, escalate to trace --follow to watch live, then doctor --plan for diagnostic checks.
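Part of the top-down read above can be automated once the payload is loaded as a dictionary. The sketch below covers the liveness, block_details, drift and binary_git checks only. It is an illustration under assumptions: `triage` is a hypothetical helper, and the `dirty` key under `binary_git` is a guess at the payload shape, not a documented field.

```python
def triage(payload: dict) -> list:
    """Walk an introspect payload in checklist order and collect findings."""
    findings = []

    liveness = (payload.get("active_phase") or {}).get("liveness")
    if liveness in ("stalled", "timeout-imminent"):
        findings.append(f"liveness={liveness}: intervene")
    elif liveness == "quiet":
        findings.append("liveness=quiet: watch, re-check in 30s")

    block = payload.get("block_details") or {}
    if block:
        flags = block.get("outstanding_flags", [])
        moves = block.get("recoverable_via", [])
        findings.append(f"blocked: {len(flags)} flag(s), {len(moves)} recovery option(s)")

    drift = (payload.get("rubric_doc") or {}).get("drift") or {}
    if drift.get("missing_locally"):
        findings.append("profile drift: missing " + ", ".join(drift["missing_locally"]))

    # Assumption: binary_git exposes a boolean 'dirty' flag.
    if (payload.get("binary_git") or {}).get("dirty"):
        findings.append("binary built from a dirty working tree")

    return findings


# Hypothetical payload fragment.
sample = {
    "active_phase": {"liveness": "stalled"},
    "block_details": {
        "outstanding_flags": ["FLAG-V4-001", "FLAG-V4-002"],
        "recoverable_via": ["fix brief and re-init (recommended)"],
    },
    "rubric_doc": {"drift": {"missing_locally": ["thoughtful"]}},
    "binary_git": {"dirty": False},
}

for finding in triage(sample):
    print(finding)
```

An empty result corresponds to the "nothing jumps out" case, at which point the escalation path is trace --follow and then doctor --plan.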