
analysis-campaign

Use when a quest needs one or more follow-up runs such as ablations, robustness checks, error analysis, or failure analysis after a main experiment.



Install command

npx skills add https://github.com/ResearAI/DeepScientist --skill analysis-campaign

Copy and paste this command into Claude Code to install the skill

Source

ResearAI/DeepScientist

Stars: 2,915

Forks: 304

Updated: April 29, 2026 at 15:36


Analysis Campaign

Use this skill when follow-up evidence is needed after a durable result. The goal is to answer a bounded, resource-aware evidence question, not to keep opening more slices just because they are imaginable.

Match signals

Use analysis-campaign when:

a durable main result already exists and follow-up evidence is needed
the quest needs ablations, robustness checks, sensitivity checks, failure analysis, error analysis, efficiency or cost checks, or limitation-boundary checks
writing, review, or rebuttal pressure exposed an evidence gap that should be answered by bounded follow-up slices

Do not use analysis-campaign when:

the quest still lacks a credible main run or accepted baseline and the proposed work depends on that missing reference
the next step is obviously another main experiment rather than follow-up evidence work
the proposed slice does not connect to a parent claim, parent result, paper gap, reviewer item, or route decision

One-sentence summary

Answer the smallest evidence question that changes, confirms, or blocks a parent claim, then stop when the next route is clear.

Control workflow

Lock the parent object, evidence question, comparison target, and stop condition. Make explicit what claim, failure mode, or route decision is actually being tested.
Audit the real execution envelope before designing the slice set. Make explicit the current device and runtime limits: available GPU or CPU class, memory, wall-clock budget, storage, concurrency, required dependencies, and any queue or service constraints that materially limit what can run now.
Choose the lightest analysis route and the smallest slice set that can answer the question within that envelope. Prefer slices with the highest soundness gain per unit of compute, time, or engineering effort. Run claim-critical slices first and mark infeasible slices explicitly instead of quietly keeping them in scope.
Keep slices isolated and comparable. Record exactly what changed, what stayed fixed, and whether apples-to-apples comparison still holds.
Record slice-level evidence before making any campaign-level claim. Every meaningful slice should leave a durable outcome and a claim update.
Aggregate only the decision-relevant findings and route the next step. End in continue, write, experiment, idea, decision, blocker, or stop.
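Steps 2 and 3 of the workflow can be sketched as a small planning helper. This is a minimal illustration, not part of the skill: the `Slice` fields, the `soundness_gain` score, and the `plan_frontier` function are hypothetical names, and a real envelope would also cover storage, dependencies, and queue limits.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    name: str
    soundness_gain: float      # hypothetical 0-1 estimate, not defined by the skill
    gpu_hours: float           # estimated cost on the current device
    peak_mem_gb: float         # estimated peak memory
    claim_critical: bool = False


def plan_frontier(slices, budget_gpu_hours, mem_limit_gb):
    """Split slices into (runnable, infeasible) under the current envelope."""
    fits = [s for s in slices if s.peak_mem_gb <= mem_limit_gb]
    infeasible = [s for s in slices if s.peak_mem_gb > mem_limit_gb]
    # Claim-critical slices first, then highest gain per GPU-hour.
    fits.sort(key=lambda s: (not s.claim_critical,
                             -s.soundness_gain / max(s.gpu_hours, 1e-9)))
    runnable, spent = [], 0.0
    for s in fits:
        if spent + s.gpu_hours <= budget_gpu_hours:
            runnable.append(s)
            spent += s.gpu_hours
        else:
            infeasible.append(s)  # marked explicitly, not kept silently in scope
    return runnable, infeasible


candidates = [
    Slice("ablate_loss", 0.8, 4.0, 16.0, claim_critical=True),
    Slice("seed_sweep", 0.5, 2.0, 16.0),
    Slice("big_model", 0.9, 3.0, 80.0),
    Slice("long_context", 0.3, 6.0, 16.0),
]
runnable, infeasible = plan_frontier(candidates, budget_gpu_hours=8.0, mem_limit_gb=24.0)
print([s.name for s in runnable])    # ['ablate_loss', 'seed_sweep']
print([s.name for s in infeasible])  # ['big_model', 'long_context']
```

The greedy ordering is only one reasonable choice; the point is that infeasible slices end up in an explicit list rather than lingering in the plan.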

Paper-facing analysis quantity reminder

For manuscript-support campaigns, first audit artifact.get_paper_contract(detail='full') and, when a draft exists, artifact.validate_manuscript_coverage(detail='full').

A mature empirical manuscript usually needs 5-10 ready paper-facing experiment or analysis groups in total, with 4-8 reviewer-facing analysis jobs in the outline when the paper is a full empirical paper. Fewer are acceptable only for an early or narrow outline with an explicit waiver.
If the user requested a concrete analysis count, such as 4-8 analyses, treat it as a tracked target; report the completed/mapped count and any explicit waiver before returning to full-paper writing.
Do not pad the count with stale methods, abandoned methods, unrelated baseline repairs, or old exploratory rows. Each slice must identify the current method or claim it supports.
If legacy-method analysis is intentionally included, mark it as baseline/comparator/negative evidence and keep it separate from current-method support.
Paper-facing slice outputs must separate the manuscript_takeaway from internal setup, user instructions, worktree paths, command history, and artifact provenance.
Do not encode local throughput shorthand such as 64 + 64 as a manuscript takeaway; record exact per-endpoint settings only as reproducibility/protocol detail when needed.
If the count is below the needed range, create the smallest claim-critical frontier rather than pretending the manuscript is ready.
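The counting rules above can be sketched as follows. The dictionary keys (`status`, `method_id`, `role`) and the function name are assumptions made for illustration; they are not fields defined by the paper contract.

```python
def analysis_count_status(slices, current_method, target=(4, 8)):
    """Count ready slices that support the current method against a target range."""
    counted = [s["id"] for s in slices
               if s["status"] == "ready"
               and s["method_id"] == current_method
               and s["role"] == "current_method_support"]
    # Legacy, comparator, or unfinished slices are reported, not padded into the count.
    excluded = [s["id"] for s in slices if s["id"] not in counted]
    low, _high = target
    return {
        "counted": len(counted),
        "target": target,
        "shortfall": max(0, low - len(counted)),
        "excluded": excluded,
    }


slices = [
    {"id": "A1", "status": "ready", "method_id": "m2", "role": "current_method_support"},
    {"id": "A2", "status": "ready", "method_id": "m2", "role": "current_method_support"},
    {"id": "A3", "status": "ready", "method_id": "m1", "role": "comparator"},
    {"id": "A4", "status": "running", "method_id": "m2", "role": "current_method_support"},
]
status = analysis_count_status(slices, current_method="m2")
print(status["counted"], status["shortfall"], status["excluded"])  # 2 2 ['A3', 'A4']
```

A non-zero shortfall is the signal to build the smallest claim-critical frontier or to record an explicit waiver.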

AVOID / pitfalls

Do not disguise a new main experiment as an analysis slice.
Do not hide null, negative, partial, failed, or contradictory slices.
Do not change many factors at once and then interpret the result as isolating one factor.
Do not widen the campaign after the next route is already clear.
Do not use subjective or manual inspection to support a claim without rubric, sample, prompt, trace, and caveat.
Do not design a slice frontier that ignores current hardware, memory, runtime, or storage limits.
Do not keep infeasible slices as silent assumptions; either downscope them, replace them with runnable proxies, or record them as blocked.

Constraints

Every meaningful slice must map to a parent claim, parent result, paper gap, reviewer item, or route decision.
Every evidence-bearing slice must record question, intervention or inspection target, fixed conditions, metric or observable, evidence path, claim update, comparability verdict, and next action.
Keep the same evaluation contract unless the variation itself is the point.
When baseline comparison matters, keep slice comparisons aligned with the active baseline metric contract unless the deviation is explicit.
Campaign-level conclusions must be derived from per-slice evidence rather than impressions.
Campaign design must be conditioned on the current execution envelope, not an idealized future machine.
If a slice would materially improve soundness but is infeasible now, record the blocker and choose the best runnable lower-cost alternative or narrower proxy.
If a slice is paper-relevant, its result must be bound back into the current paper contract rather than left only in experiments/analysis-results/* or chat.
Writing-facing slices must carry write-back metadata: paper_role, section_id, item_id, claim_links, method/comparator id, display target, and main/appendix role.
Writing-facing campaign metadata should keep selected_outline_ref, research_questions, experimental_designs, and todo_items explicit; map results back to paper/paper_experiment_matrix.md with exp_id, section_id, item_id, claim_links, and paper_role.
Classify paper evidence as claim-carrying, supporting, or auxiliary; keep stable support separate from contradiction, and record comparison_baselines, evaluation_summary, takeaway, and comparability when comparisons matter.
Include highlight-validation, efficiency or cost, robustness, failure, and limitation checks only when they answer the parent claim or reviewer question.

Validation

Before analysis-campaign can end, all applicable checks should be true:

the parent object is explicit
the current execution envelope and its binding constraints are explicit when they affect slice design or ordering
every launched slice has a durable outcome: completed, partial, failed, blocked, infeasible, or superseded
launched and deferred slices were screened against the current device or resource limits
null, negative, failed, partial, and contradictory findings remain visible
the campaign changed or confirmed the evidence boundary of the parent claim with traceable slice-level evidence
the next route is explicit: continue campaign, return to experiment, return to idea, move to write, route through decision, stop, reset, or record a blocker

Interaction discipline

Follow the shared interaction contract injected by the system prompt. Keep campaign updates brief unless the evidence boundary, blocker state, cost, or next route changed materially. For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update. For meaningful long-running slices, include the estimated next reply time or next check-in window whenever it is defensible.
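As a rough sketch, the cadence above can be expressed as a small check. The function name and return labels are hypothetical, and the thresholds are the approximate figures from the paragraph, so they should be read as soft limits rather than exact cutoffs.

```python
def update_due(tool_calls_since_update, minutes_since_update, has_meaningful_delta):
    """Return 'required', 'recommended', or 'not_yet' for a user-visible update."""
    # Do not drift beyond roughly 12 tool calls or about 8 minutes.
    if tool_calls_since_update >= 12 or minutes_since_update >= 8:
        return "required"
    # Prefer an update after roughly 6 tool calls with a human-meaningful delta.
    if tool_calls_since_update >= 6 and has_meaningful_delta:
        return "recommended"
    return "not_yet"


print(update_due(3, 2.0, True))    # not_yet
print(update_due(7, 4.0, True))    # recommended
print(update_due(7, 4.0, False))   # not_yet
print(update_due(5, 9.0, False))   # required
```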

Authority and freedom

The agent owns the analysis path. It may choose a one-slice check, a lightweight durable report, an artifact-backed one-slice campaign, a full multi-slice campaign, or a writing-facing campaign. It may choose slice order, workspace layout, filenames, monitoring strategy, and whether a smoke test, direct verification, or full run is the right first move. It may also shrink, reorder, or replace slices to fit the real hardware and runtime envelope, as long as the resulting campaign still answers the parent evidence question honestly.

Do not treat PLAN.md, CHECKLIST.md, artifact.create_analysis_campaign(...), one-slice campaigns, returned worktrees, evaluation_summary, smoke tests, detached runs, paper-matrix files or updates, tqdm, or a fixed phase order as universally required paths. PLAN.md, CHECKLIST.md, paper/paper_experiment_matrix.md, and local matrix or checklist files are allowed control surfaces, not mandatory success paths; they are tactics. The hard requirement is traceable evidence that changes, confirms, or blocks the evidence boundary of the parent claim and leaves an explicit next route.

Use the artifact-backed campaign path when durable lineage, branch or worktree isolation, Canvas visibility, paper or rebuttal traceability, or multiple slices matter. Use a lighter durable report when one bounded answer is enough and extra campaign overhead would not improve trust, routing, or auditability.

For campaign prioritization and writing-facing slice design, read references/campaign-design.md. When the campaign is writing-facing and the mapping fields are not obvious, also read references/writing-facing-slice-examples.md. For artifact examples and edge-case examples, also read references/artifact-flow-examples.md and references/boundary-cases.md.

Hard success gates

An analysis campaign succeeds when it changes or confirms the evidence boundary of a parent claim with traceable slice-level evidence, preserves comparability or records why comparability broke, and leaves a durable next-route decision.

Before treating analysis as successful, all applicable gates must be true:

the parent object is explicit, such as a main run, accepted idea line, paper gap, reviewer item, or rebuttal item
the claim, question, failure mode, or decision being tested is explicit
the slice frontier was screened against current compute, memory, storage, dependency, and runtime limits
every launched slice has a durable outcome: completed, partial, failed, blocked, infeasible, or superseded
every evidence-bearing slice records the question, intervention or inspection target, fixed conditions, metric or observable, evidence path, claim update, comparability verdict, and next action
null, negative, failed, partial, and contradictory findings remain visible
campaign-level interpretation is derived from per-slice evidence rather than impressions
the next route is explicit: continue campaign, return to experiment, return to idea, move to write, route through decision, stop, reset, or record a blocker
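A few of these gates are mechanical enough to check in code. The sketch below assumes a hypothetical campaign dictionary; the key names and the short route labels are illustrative and do not come from the artifact API.

```python
DURABLE_OUTCOMES = {"completed", "partial", "failed", "blocked", "infeasible", "superseded"}
NEXT_ROUTES = {"continue", "experiment", "idea", "write", "decision", "stop", "reset", "blocker"}


def failed_gates(campaign):
    """Return a list of human-readable gate failures (empty means all checked gates pass)."""
    failures = []
    if not campaign.get("parent_object"):
        failures.append("parent object is not explicit")
    if not campaign.get("question"):
        failures.append("tested claim or question is not explicit")
    if not campaign.get("envelope_screened"):
        failures.append("slice frontier was not screened against current limits")
    for s in campaign.get("slices", []):
        if s.get("outcome") not in DURABLE_OUTCOMES:
            failures.append(f"slice {s['id']} has no durable outcome")
    if campaign.get("next_route") not in NEXT_ROUTES:
        failures.append("next route is not explicit")
    return failures


campaign = {
    "parent_object": "main run (illustrative)",
    "question": "does the gain survive without augmentation?",
    "envelope_screened": True,
    "slices": [{"id": "S1", "outcome": "completed"}, {"id": "S2", "outcome": None}],
    "next_route": "write",
}
print(failed_gates(campaign))  # ['slice S2 has no durable outcome']
```

Gates about visibility of negative findings and evidence-derived interpretation still need a human or agent judgment; the check only covers what can be read off the record.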

Analysis routes

Use the lightest route that preserves trust and downstream utility.

analysis-lite: one clear follow-up question, one slice or very small slice set, and a compact durable result
artifact-backed campaign: one or more slices that need durable lineage, branch/worktree isolation, Canvas visibility, or later replay
writing-facing campaign: evidence directly supports a selected outline, paper experiment matrix, evidence ledger, section, claim, or table
review/rebuttal campaign: evidence directly answers reviewer pressure or audit findings
failure-analysis route: evidence explains why a result failed, diverged, or became non-comparable

Start the smallest route that can answer the current follow-up question. Run claim-critical slices first, weighted by soundness gain under the current resource budget, and stop widening once the next route is already clear.

Useful slice classes:

auxiliary: helps understand settings, thresholds, or mechanisms but does not carry the main claim by itself
claim-carrying: directly affects whether the main narrative or route decision is justified
supporting: broadens confidence or interpretability after the main claim is already credible

Slice evidence contract

For each meaningful slice, define and record enough of the following to make the evidence reusable:

research question
hypothesis, expected pattern, or decision-relevant expectation
intervention, ablation, variation, inspection target, or failure bucket
controls or fixed conditions
metric, observable, table, qualitative artifact, or rubric
comparison target
expected resource class or major execution constraint when it affects feasibility
stop condition or completion condition
evidence path expectations
claim update
comparability verdict
next action

Code-based, fully automatable analysis is preferred when it is the most faithful and repeatable path. But not every valid analysis must be fully automatable: failure-bucket inspection, qualitative artifact review, extracted-text audits, reviewer-linked example checks, or table/figure consistency checks can be valid when the evidence is concrete, sampled or scoped, and reproducible enough for the claim being made.

Do not present subjective judgment as objective measurement. If human, model, or qualitative judgment is used, record the rubric, sample, prompt or inspection basis, caveats, and why it is sufficient for the route decision.
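One way to keep the contract honest is a small record type that reports which required fields are still empty. This is a sketch only: `SliceRecord` and its field names are hypothetical shorthand for the items an evidence-bearing slice must record, not a schema shipped with the skill.

```python
from dataclasses import dataclass, fields


@dataclass
class SliceRecord:
    question: str = ""
    target: str = ""                 # intervention or inspection target
    fixed_conditions: str = ""
    metric: str = ""                 # metric, observable, or rubric
    evidence_path: str = ""
    claim_update: str = ""
    comparability_verdict: str = ""
    next_action: str = ""

    def missing(self):
        """Names of required fields that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


record = SliceRecord(
    question="Is the gain driven by the auxiliary loss?",
    target="remove auxiliary loss",
    fixed_conditions="same data, seed set, and eval script",
    metric="accuracy on the accepted test split",
    evidence_path="(path recorded at run time)",
    comparability_verdict="comparable",
    next_action="write",
)
print(record.missing())  # ['claim_update']
```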

Comparability contract

Comparability is a hard boundary.

keep the same evaluation contract unless the variation is the point
when active_baseline_metric_contract_json exists and baseline comparison matters, read it before defining slice success criteria or comparison tables
when active_baseline_metric_contract_json exists, keep slice comparisons aligned with it unless the slice explicitly records why it differs
state exactly what changed
state exactly what stayed fixed
keep naming and output paths clean enough that multiple runs can coexist

If the variation itself changes the evaluation setup, record that explicitly and do not present the run as a direct apples-to-apples comparison.

Do not bring in a new dataset as if it were the same comparison contract. A new dataset can be valid as a generalization, external-validity, stress-test, or limitation-boundary slice, but it must be labeled that way and must not replace the accepted baseline or main comparison contract.

If a slice needs an extra comparator baseline, place it under the normal baseline roots, do not overwrite the canonical quest baseline gate, and record it back through record_analysis_slice(..., comparison_baselines=[...]).
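Stating what changed and what stayed fixed can be made mechanical by diffing run configurations. The following is a minimal sketch under assumed inputs: the config keys, the `eval_keys` default, and the verdict labels are hypothetical, and a real check would read the active baseline metric contract rather than a hand-written dictionary.

```python
def comparability_verdict(baseline_cfg, slice_cfg, intended_change,
                          eval_keys=("dataset", "split", "metric")):
    """Diff two configs and classify the comparison."""
    all_keys = set(baseline_cfg) | set(slice_cfg)
    changed = sorted(k for k in all_keys if baseline_cfg.get(k) != slice_cfg.get(k))
    fixed = sorted(k for k in all_keys if k not in changed)
    if any(k in eval_keys for k in changed):
        verdict = "not_apples_to_apples"   # evaluation setup itself moved
    elif any(k not in intended_change for k in changed):
        verdict = "confounded"             # more than the intended factor moved
    else:
        verdict = "comparable"
    return {"changed": changed, "fixed": fixed, "verdict": verdict}


baseline = {"dataset": "A", "split": "test", "metric": "acc", "lr": 1e-3, "aug": True}
ablation = dict(baseline, aug=False)
stress = dict(baseline, dataset="B")

print(comparability_verdict(baseline, ablation, intended_change={"aug"}))
# {'changed': ['aug'], 'fixed': ['dataset', 'lr', 'metric', 'split'], 'verdict': 'comparable'}
print(comparability_verdict(baseline, stress, intended_change={"dataset"})["verdict"])
# not_apples_to_apples
```

A `not_apples_to_apples` verdict does not make the slice invalid; it means the slice should be labeled as generalization, stress-test, or limitation-boundary evidence instead of a direct comparison.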

Writing-facing boundary

If analysis directly supports a paper or paper-like report, the evidence must be write-backable. That does not always mean a selected outline must exist before any pre-outline evidence check, but paper-ready slices must map cleanly back to a selected outline, paper experiment matrix, evidence ledger, section, claim, table, or reviewer item.

For concrete paper-facing cases:

if the slice is the only thing keeping a main-text section unsupported, make it main_required or main_text
if the slice is useful but non-blocking, make it appendix
if the slice is informative but not meant for the manuscript, keep it durable and mark it reference_only with a reason
if a selected outline exists, map paper-ready slices to named research_question and experimental_design fields when those fields exist
if paper/paper_experiment_matrix.md exists and the campaign is directly supporting the paper, read it before launching or reordering the slice set
for writing-facing campaigns, prefer stable ids such as exp_id, todo_id, or slice_id over free-form notes
paper-ready slices should carry the available write-back fields such as paper_role, section_id, item_id, claim_links, analysis_role, reviewer_question, target_display, main_or_appendix, and failure_interpretation when those fields exist in the paper contract
paper-ready slices should record whether they support the latest method, an older comparator, a failure mode, or an appendix-only sanity check
paper-ready slices should label implementation/setup details as reproducibility_detail or internal_only when they should not become main-text prose
after every completed paper-ready slice, update or verify the relevant paper experiment matrix, section notes, evidence ledger, or active paper-line summary

Do not leave a slice "completed" while the paper contract still looks stale and that slice is meant to unblock the paper. If no selected outline exists yet but the evidence question is needed to decide whether writing is worthwhile, run it as pre-outline analysis and route to write or decision afterward.
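The first three paper-facing cases amount to a small decision rule, sketched below. The function and its boolean inputs are hypothetical; only the role labels (`main_required`, `appendix`, `reference_only`) are taken from the cases above.

```python
def manuscript_role(blocks_main_text, useful_for_paper, reason=""):
    """Pick a main/appendix role for a paper-facing slice."""
    if blocks_main_text:
        # The slice is the only thing keeping a main-text section unsupported.
        return {"main_or_appendix": "main_required"}
    if useful_for_paper:
        return {"main_or_appendix": "appendix"}
    if not reason:
        raise ValueError("reference_only slices need a recorded reason")
    return {"main_or_appendix": "reference_only", "reason": reason}


print(manuscript_role(True, True))
# {'main_or_appendix': 'main_required'}
print(manuscript_role(False, True))
# {'main_or_appendix': 'appendix'}
print(manuscript_role(False, False, reason="internal sanity check only"))
# {'main_or_appendix': 'reference_only', 'reason': 'internal sanity check only'}
```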

Durable route records

Durable records are required in substance, not in fixed filenames. The agent may choose the shortest durable form that lets a later turn resume without guessing.

For multi-slice, writing-facing, route-changing, expensive, unstable, or long-running analysis, leave a route record that states:

parent object and parent claim
acceptance or stop condition
slice list or first slice frontier
comparability boundary
execution envelope and the slices ruled infeasible under it
available assets and required comparators
evidence paths or expected outputs
current blocker or fallback
next route after success or failure

PLAN.md, CHECKLIST.md, paper/paper_experiment_matrix.md, and local matrix or checklist files are allowed control surfaces, not mandatory success paths. Use references/campaign-plan-template.md and references/campaign-checklist-template.md when they help, but do not expand them into paperwork for its own sake.

If slice feasibility, ordering, comparators, or campaign interpretation changes materially, revise the durable route record before spending more compute.
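Because the record is required in substance rather than in a fixed filename, any serializable form works. The sketch below uses JSON; the field names and all example values are illustrative assumptions, not a format prescribed by the skill.

```python
import json

ROUTE_RECORD_FIELDS = [
    "parent_object_and_claim", "stop_condition", "slice_frontier",
    "comparability_boundary", "execution_envelope", "infeasible_slices",
    "assets_and_comparators", "evidence_paths", "blocker_or_fallback", "next_route",
]


def render_route_record(record):
    """Serialize a route record, refusing to write one with missing fields."""
    missing = [k for k in ROUTE_RECORD_FIELDS if k not in record]
    if missing:
        raise ValueError(f"route record is missing: {missing}")
    return json.dumps({k: record[k] for k in ROUTE_RECORD_FIELDS}, indent=2)


example = {
    "parent_object_and_claim": "main run; claim: auxiliary loss drives the gain",
    "stop_condition": "gain attributed or refuted within noise",
    "slice_frontier": ["ablate_loss", "seed_sweep"],
    "comparability_boundary": "same dataset, split, and metric as the baseline",
    "execution_envelope": {"gpu": "single device", "wall_clock_hours": 8},
    "infeasible_slices": ["big_model"],
    "assets_and_comparators": ["accepted baseline checkpoint"],
    "evidence_paths": [],
    "blocker_or_fallback": None,
    "next_route": "write if the ablation is conclusive, else decision",
}
text = render_route_record(example)
print(json.loads(text)["infeasible_slices"])  # ['big_model']
```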

Operational guidance

The main skill file keeps the control surface up front. Longer operational notes live in references/operational-guidance.md:

use it when the route needs the exact artifact-backed campaign tactics
use it when execution monitoring, stall handling, or slice recording details matter
use it when memory handling or connector-facing chart notes materially affect the route

Negative cases and stop rules

Do not treat analysis as successful when:

slices do not map to a parent claim, parent result, paper gap, reviewer item, or decision
a summary claims stable support without per-slice evidence
negative, null, contradictory, failed, or partial slices are hidden
an ablation changes many factors but is interpreted as isolating one factor
a robustness slice changes dataset, split, or evaluation protocol but is reported as direct apples-to-apples comparison
subjective or manual inspection supports a claim without rubric, sample, prompt, trace, or caveat
a writing-facing slice is called paper-ready but cannot be mapped back to the paper matrix, evidence ledger, outline, claim, section, or reviewer item
a completed paper-relevant slice remains visible only as a free-floating analysis result and is not bound back into the current paper contract
a failed slice is silently skipped and replaced by a different slice
the campaign keeps expanding after the next route is already clear
the campaign scope assumes hardware, memory, or runtime that is not actually available in the current environment
a new comparator overwrites the canonical quest baseline gate instead of being recorded as analysis-local comparison evidence
the underlying main result is still untrusted and the proposed work is really baseline recovery or a new main experiment
a new main experiment is disguised as an analysis slice to bypass the main-experiment gate

If two slices in a row fail to change the claim boundary, matrix frontier, or next route, stop widening the campaign and route through decision, write, experiment, or an explicit blocker.
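The two-in-a-row rule is simple enough to state as code. In this sketch, `slice_changes` is a hypothetical list of booleans in run order, where `True` means the slice changed the claim boundary, matrix frontier, or next route.

```python
def should_stop_widening(slice_changes, window=2):
    """True when the last `window` slices all failed to change anything."""
    if len(slice_changes) < window:
        return False
    return not any(slice_changes[-window:])


print(should_stop_widening([True, False]))          # False
print(should_stop_widening([True, False, False]))   # True
print(should_stop_widening([False]))                # False
```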

Record blocked or failed campaign states explicitly, such as missing parent run, under-specified analysis question, run failure before evidence, non-comparable metrics, missing assets, missing credentials, or still-ambiguous campaign conclusion. A blocked campaign should still name the next best action.

Aggregation and reporting

Campaign reporting should explain:

which findings are stable
which findings are fragile
what changed the interpretation of the main result
which open questions still remain
whether the main claim should be strengthened, weakened, narrowed, abandoned, or left ambiguous
which slice changed the interpretation most
which planned slices were intentionally skipped because earlier results made them low value

Focus on the highest-impact findings first. Results matter more than process narration. If using tables, show only the most decision-relevant rows. Separate stable support, partial support, contradiction, and unresolved ambiguity. When there are many slices, summarize the top 3-5 most important ones first, then point to the full evidence paths.
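A compact report along these lines might be assembled as below. The `impact` score and the `support` labels are assumptions introduced for illustration; the skill itself does not define how impact is scored.

```python
def summarize(slices, top_n=5):
    """Keep the top-N slices by impact and separate them by support class."""
    ranked = sorted(slices, key=lambda s: s["impact"], reverse=True)[:top_n]
    report = {"stable_support": [], "partial_support": [],
              "contradiction": [], "unresolved": []}
    for s in ranked:
        report[s["support"]].append(s["id"])
    return report


results = [
    {"id": "S1", "impact": 0.9, "support": "stable_support"},
    {"id": "S2", "impact": 0.2, "support": "partial_support"},
    {"id": "S3", "impact": 0.7, "support": "contradiction"},
    {"id": "S4", "impact": 0.5, "support": "unresolved"},
]
print(summarize(results, top_n=3))
# {'stable_support': ['S1'], 'partial_support': [], 'contradiction': ['S3'], 'unresolved': ['S4']}
```

Slices that fall outside the top N are not dropped from the record; the summary should still point to the full evidence paths.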

Exit criteria

Exit once one of these is durably true:

the campaign produced enough evidence for writing or decision-making
the campaign exposed a problem that requires returning to experiment, idea, baseline recovery, or decision
the campaign is blocked and the blocker is durably recorded
the campaign route changed because the original slice set is no longer the best evidence-per-cost path

A good campaign closes when the claim got stronger, weaker, narrower, abandoned, or clearly stuck, not when more slice ideas merely remain possible.

name: analysis-campaign
description: Use when a quest needs one or more follow-up runs such as ablations, robustness checks, error analysis, or failure analysis after a main experiment.
skill_role: stage
