experiment-running
Execute the plan by dispatching fresh subagents per task, monitoring status, and collecting results
| Field | Value |
|---|---|
| name | experiment-running |
| description | Execute the plan by dispatching fresh subagents per task, monitoring status, and collecting results |
| version | 1.0.0 |
| category | experiment-execution |
| type | strategy |
| sops | ["implementer-dispatch","execution-monitoring","result-collection"] |
| tactics | ["subagent-execution-loop","checkpoint-and-recover"] |
| dependencies | {"sops":["execution-monitoring","implementer-dispatch","result-collection","ponytail:ponytail","ponytail:ponytail-debt","superpowers:executing-plans","superpowers:finishing-a-development-branch","superpowers:subagent-driven-development","superpowers:using-git-worktrees","superpowers:verification-before-completion"],"tactics":["checkpoint-and-recover","subagent-execution-loop"]} |
Key Question: How to execute?
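Implementation is handed directly to the existing superpowers chain; this strategy no longer hand-writes its own fresh-subagent dispatch or three-stage review. Once the plan (produced upstream by plan-writing) is ready, the strategy is a sequence of decision nodes:

1. Skill load superpowers:using-git-worktrees to create an isolated workspace and run the baseline tests.
2. Skill load ponytail:ponytail to switch on the lean reflex before writing any code (stay lean while writing).
3. Skill load superpowers:verification-before-completion to run the proving command before claiming completion.
4. Skill load ponytail:ponytail-debt to collect the ponytail: debt markers before wrapping up.
5. Skill load superpowers:finishing-a-development-branch to verify tests, then merge / PR / branch.

DARE's native checkpoint-and-recover (checkpoint before high-risk operations) and subagent-execution-loop (execution-loop details) remain in the orchestration as tactics.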
[plan from plan-writing]
→ superpowers:using-git-worktrees (isolated workspace + baseline)
→ ponytail:ponytail (lean reflex on)
→ superpowers:executing-plans or superpowers:subagent-driven-development
→ superpowers:verification-before-completion (verify before claiming)
→ ponytail:ponytail-debt (collect debt)
→ superpowers:finishing-a-development-branch (wrap-up)
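The flow above can be read as an ordered list with one branch point at the execution step. The following is a minimal sketch of that ordering only; the function `build_skill_chain` and its `use_subagents` flag are hypothetical names introduced for illustration, and the skill identifiers are copied from the flow as written.

```python
from typing import List


def build_skill_chain(use_subagents: bool) -> List[str]:
    """Return the skill identifiers in the order the flow lists them.

    `use_subagents` is a hypothetical flag that picks between the two
    executors named at the branch point of the flow.
    """
    executor = (
        "superpowers:subagent-driven-development"
        if use_subagents
        else "superpowers:executing-plans"
    )
    return [
        "superpowers:using-git-worktrees",             # isolated workspace + baseline
        "ponytail:ponytail",                           # lean reflex on
        executor,                                      # execute the plan
        "superpowers:verification-before-completion",  # verify before claiming
        "ponytail:ponytail-debt",                      # collect debt markers
        "superpowers:finishing-a-development-branch",  # wrap-up
    ]


if __name__ == "__main__":
    for step in build_skill_chain(use_subagents=True):
        print("->", step)
```

Keeping the chain as plain data makes the single decision (which executor to use) explicit while the surrounding steps stay fixed.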
| Step | Max Budget | Output |
|---|---|---|
| Per-task execution | 50% of execution budget / N tasks | Task result |
| Monitoring overhead | 5% of execution budget | Status log |
| Retry budget | 10% of execution budget | Unblocked tasks |
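To make the arithmetic in the budget table concrete, here is a small sketch that turns the percentages into caps. It assumes "50% of execution budget / N tasks" means the 50% share is divided evenly across the N tasks; that reading, the `BudgetSplit` type, and the `split_budget` function are assumptions for illustration, not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class BudgetSplit:
    per_task: float    # cap for each individual task
    monitoring: float  # cap for monitoring overhead
    retry: float       # cap for retries that unblock tasks


def split_budget(execution_budget: float, n_tasks: int) -> BudgetSplit:
    """Apply the table's percentages to an execution budget.

    Assumption: the 50% execution share is split evenly across n_tasks.
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    return BudgetSplit(
        per_task=0.50 * execution_budget / n_tasks,
        monitoring=0.05 * execution_budget,
        retry=0.10 * execution_budget,
    )


if __name__ == "__main__":
    print(split_budget(execution_budget=1000, n_tasks=5))
```

Under this reading, a budget of 1000 units over 5 tasks gives each task a cap of 100 units. The three rows sum to 65% of the execution budget; the table does not say how the remainder is used.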
Optional, no fixed order; the final leaf is always a sop.
| Tactic | When to use |
|---|---|
| checkpoint-and-recover | Checkpoint state before risky operations, detect anomalies, and recover gracefully |
| subagent-execution-loop | Orchestrate task execution via fresh subagents with dispatch, monitoring, and result collection |
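The checkpoint-and-recover idea can be sketched in a few lines: snapshot the state, attempt the risky operation, and restore the snapshot if it fails. This is an illustrative in-memory version only; the function `run_with_checkpoint` and the dict-shaped state are hypothetical, and a real run would more likely checkpoint files or a git state.

```python
import copy
from typing import Any, Callable, Dict


def run_with_checkpoint(
    state: Dict[str, Any],
    risky_op: Callable[[Dict[str, Any]], None],
) -> bool:
    """Run risky_op on state in place; roll back to the snapshot on failure.

    Returns True if the operation succeeded, False if it raised and the
    state was restored.
    """
    snapshot = copy.deepcopy(state)  # checkpoint before the risky operation
    try:
        risky_op(state)
        return True
    except Exception:
        # Recover: discard partial changes and restore the checkpoint.
        state.clear()
        state.update(snapshot)
        return False


if __name__ == "__main__":
    state = {"completed": ["task-1"]}

    def failing_op(s: Dict[str, Any]) -> None:
        s["completed"].append("task-2")  # partial change
        raise RuntimeError("simulated failure")

    ok = run_with_checkpoint(state, failing_op)
    print(ok, state)
```

The deep copy matters here: a shallow copy would share the nested list, so the partial change would survive the rollback.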
Optional, no fixed order; the final leaf is always a sop.
| SOP | When to use |
|---|---|
| execution-monitoring | Monitor execution progress, detect anomalies, and report status |
| implementer-dispatch | Dispatch execution subagent — select model by complexity, construct prompt with full task context |
| ponytail:ponytail | Lazy-senior reflex: simplest thing that holds; mark every deliberate shortcut |
| ponytail:ponytail-debt | Harvest ponytail debt markers before finishing |
| result-collection | Collect experiment outputs — metrics, logs, artifacts — into structured result set |
| superpowers:executing-plans | Execute the plan task-by-task in the current session with checkpoints |
| superpowers:finishing-a-development-branch | Verify tests -> merge / PR / branch cleanup |
| superpowers:subagent-driven-development | Execute the plan via a fresh subagent per task with two-stage review |
| superpowers:using-git-worktrees | Create an isolated worktree + run baseline tests before implementing |
| superpowers:verification-before-completion | Run the proving command and confirm output before claiming done |
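As one way to picture the "run the proving command and confirm output" row, the sketch below runs a command and only reports success when both the exit code and the output match expectations. The helper `verify_before_claiming` and the idea of matching on an expected substring are assumptions for illustration; the actual superpowers skill may define verification differently.

```python
import subprocess
import sys
from typing import List


def verify_before_claiming(command: List[str], expected: str) -> bool:
    """Run the proving command; succeed only on exit code 0 and matching output.

    `expected` is a hypothetical substring that must appear in stdout.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0 and expected in result.stdout


if __name__ == "__main__":
    # Stand-in proving command: a real run would invoke the project's test suite.
    proving_command = [sys.executable, "-c", "print('3 passed')"]
    if verify_before_claiming(proving_command, expected="passed"):
        print("verified: safe to claim completion")
    else:
        print("not verified: do not claim completion")
```

Checking the output as well as the exit code guards against a command that exits cleanly without having run anything meaningful.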