experiment-running
Execute the plan by dispatching fresh subagents per task, monitoring status, and collecting results
| Field | Value |
|---|---|
| name | experiment-running |
| description | Execute the plan by dispatching fresh subagents per task, monitoring status, and collecting results |
| version | 1.0.0 |
| category | experiment-execution |
| type | strategy |
| sops | ["implementer-dispatch","execution-monitoring","result-collection"] |
| tactics | ["subagent-execution-loop","checkpoint-and-recover"] |
| dependencies | {"sops":["execution-monitoring","implementer-dispatch","result-collection","ponytail:ponytail","ponytail:ponytail-debt","superpowers:executing-plans","superpowers:finishing-a-development-branch","superpowers:subagent-driven-development","superpowers:using-git-worktrees","superpowers:verification-before-completion"],"tactics":["checkpoint-and-recover","subagent-execution-loop"]} |
Key Question: How to execute?
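Implementation is delegated directly to the existing superpowers chain; this strategy no longer hand-rolls its own fresh-subagent dispatch or three-stage review. Once the plan (produced upstream by plan-writing) is ready, the strategy is a sequence of decision nodes:
1. Skill load superpowers:using-git-worktrees to create an isolated workspace and run the baseline tests.
2. Skill load ponytail:ponytail to turn on the lean reflex before writing any code (stay lean while writing).
3. Skill load superpowers:verification-before-completion to run the proving command before claiming completion.
4. Skill load ponytail:ponytail-debt to collect the ponytail: debt markers before wrapping up.
5. Skill load superpowers:finishing-a-development-branch to verify tests, then merge / PR / branch.
DARE's native checkpoint-and-recover (save state before high-risk operations) and subagent-execution-loop (execution-loop details) remain in the orchestration as tactics.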
[plan from plan-writing]
→ superpowers:using-git-worktrees (isolated workspace + baseline)
→ ponytail:ponytail (lean reflex on)
→ superpowers:executing-plans or superpowers:subagent-driven-development
→ superpowers:verification-before-completion (verify before claiming done)
→ ponytail:ponytail-debt (harvest debt markers)
→ superpowers:finishing-a-development-branch (wrap up)
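The flow above can be written down as an ordered list with a single branch point at the executor step. The sketch below is only an illustration: the skill names come from the flow, while the function `build_skill_chain` and its `use_subagents` flag are hypothetical and not part of any of these skills.

```python
from typing import List


def build_skill_chain(use_subagents: bool) -> List[str]:
    """Return the skills to load, in order, for one execution run.

    Hypothetical helper: only the skill names are taken from the document.
    """
    # The only decision in the chain: in-session execution vs. a fresh
    # subagent per task.
    executor = (
        "superpowers:subagent-driven-development"
        if use_subagents
        else "superpowers:executing-plans"
    )
    return [
        "superpowers:using-git-worktrees",             # isolated workspace + baseline
        "ponytail:ponytail",                           # lean reflex on
        executor,                                      # run the plan
        "superpowers:verification-before-completion",  # prove before claiming done
        "ponytail:ponytail-debt",                      # harvest debt markers
        "superpowers:finishing-a-development-branch",  # merge / PR / branch
    ]
```

Keeping the chain as plain data makes the fixed order easy to inspect and leaves the executor choice as the one explicit decision.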
| Step | Max Budget | Output |
|---|---|---|
| Per-task execution | (50% of execution budget) / N tasks | Task result |
| Monitoring overhead | 5% of execution budget | Status log |
| Retry budget | 10% of execution budget | Unblocked tasks |
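As a worked example of the table, the sketch below turns the percentages into concrete caps. It assumes the per-task row means half of the execution budget split evenly across the N tasks. The three rows add up to 65%, so the remaining 35% is reported as unallocated rather than assigned to anything. The names `BudgetSplit` and `split_execution_budget` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetSplit:
    per_task: float     # cap for each individual task
    monitoring: float   # cap for monitoring overhead
    retry: float        # cap for retries
    unallocated: float  # share the table does not assign


def split_execution_budget(total: float, n_tasks: int) -> BudgetSplit:
    """Apply the table's percentages to a total execution budget.

    Assumption: "50% / N tasks" means the 50% share is divided evenly.
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    execution_share = total * 50 / 100
    monitoring = total * 5 / 100
    retry = total * 10 / 100
    return BudgetSplit(
        per_task=execution_share / n_tasks,
        monitoring=monitoring,
        retry=retry,
        unallocated=total - execution_share - monitoring - retry,
    )
```

With a budget of 1000 units and 5 tasks, each task is capped at 100 units, monitoring at 50, and retries at 100, leaving 350 units unassigned by the table.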
Optional, no fixed order; the final leaf is always a sop.
| Tactic | When to use |
|---|---|
| checkpoint-and-recover | Checkpoint state before risky operations, detect anomalies, and recover gracefully |
| subagent-execution-loop | Orchestrate task execution via fresh subagents with dispatch, monitoring, and result collection |
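The checkpoint-and-recover idea can be sketched as a context manager that snapshots state before a risky step and restores it if the step fails. This is a minimal illustration under stated assumptions, not the tactic's actual implementation: the in-memory `state` dictionary and the `checkpoint` helper are hypothetical, and a real run would more likely checkpoint files or a git worktree.

```python
import copy
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def checkpoint(state: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Snapshot `state` before a risky block; roll back if the block raises."""
    snapshot = copy.deepcopy(state)  # taken before the risky operation starts
    try:
        yield state
    except Exception:
        # Recover: restore the snapshot in place, then let the caller decide
        # whether to retry or abort.
        state.clear()
        state.update(snapshot)
        raise


# Usage: a failed step leaves the state exactly as it was at the checkpoint.
state = {"completed": ["task-1"]}
try:
    with checkpoint(state):
        state["completed"].append("task-2")
        raise RuntimeError("anomaly detected")
except RuntimeError:
    pass
```

Re-raising after the rollback keeps the failure visible, so the orchestrator rather than the helper decides whether to spend retry budget.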
Optional, no fixed order; the final leaf is always a sop.
| SOP | When to use |
|---|---|
| execution-monitoring | Monitor execution progress, detect anomalies, and report status |
| implementer-dispatch | Dispatch execution subagent — select model by complexity, construct prompt with full task context |
| ponytail:ponytail | Lazy-senior reflex: simplest thing that holds; mark every deliberate shortcut |
| ponytail:ponytail-debt | Harvest ponytail debt markers before finishing |
| result-collection | Collect experiment outputs — metrics, logs, artifacts — into structured result set |
| superpowers:executing-plans | Execute the plan task-by-task in the current session with checkpoints |
| superpowers:finishing-a-development-branch | Verify tests -> merge / PR / branch cleanup |
| superpowers:subagent-driven-development | Execute the plan via a fresh subagent per task with two-stage review |
| superpowers:using-git-worktrees | Create an isolated worktree + run baseline tests before implementing |
| superpowers:verification-before-completion | Run the proving command and confirm output before claiming done |
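The three native SOPs (implementer-dispatch, execution-monitoring, result-collection) fit together roughly as in the sketch below. Everything here is an assumption made for illustration: the `Task` fields, the 1 to 5 complexity scale, the placeholder tier names `small-model` and `large-model`, and the `dispatch` callable that stands in for whatever actually launches a subagent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    complexity: int  # hypothetical scale: 1 (trivial) to 5 (hard)
    context: str     # full task context passed to the subagent


@dataclass
class RunLog:
    results: Dict[str, str] = field(default_factory=dict)  # result-collection
    status: List[str] = field(default_factory=list)        # execution-monitoring


def pick_model(task: Task) -> str:
    # Placeholder tier names, not real model identifiers.
    return "large-model" if task.complexity >= 4 else "small-model"


def build_prompt(task: Task) -> str:
    return f"Task: {task.name}\nContext:\n{task.context}"


def run_tasks(
    tasks: List[Task],
    dispatch: Callable[[str, str], str],
    max_retries: int = 1,
) -> RunLog:
    """Dispatch each task to a fresh subagent, log status, collect outputs."""
    log = RunLog()
    for task in tasks:
        for attempt in range(max_retries + 1):
            try:
                output = dispatch(pick_model(task), build_prompt(task))
            except Exception as exc:
                log.status.append(
                    f"{task.name}: attempt {attempt + 1} failed ({exc})"
                )
                continue
            log.results[task.name] = output
            log.status.append(f"{task.name}: done")
            break
        else:
            # All attempts failed: report the task as blocked.
            log.status.append(f"{task.name}: blocked")
    return log
```

Passing `dispatch` in as a parameter keeps the loop testable with a fake subagent and avoids tying the sketch to any particular agent runtime.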