statistical-experimental-evaluation

Stars: 13,557

Forks: 1,589

Updated: May 20, 2026 at 04:39

Design and run statistical experiments that test the formal problem, proposed methods, theoretical predictions, baselines, and ablations.

Source

aiming-lab

aiming-lab/AutoResearchClaw

Related occupations (based on SOC occupation classification)

Statisticians · Computer and Mathematical Occupations · SOC 15-2041

SKILL.md

name	statistical-experimental-evaluation
description	Design and run statistical experiments that test the formal problem, proposed methods, theoretical predictions, baselines, and ablations.
metadata	{"category":"domain","trigger-keywords":"experiment,simulation,evaluation,comparison,baseline,ablation,metrics,diagnostics,statistical evidence","applicable-stages":"7,8,9,10,11,12,13,14","priority":"1"}

Statistical Experimental Evaluation

Overview

Use this skill after formulation, method proposal, and theory. Experiments should test specific claims and theoretical predictions.

Experiment Plan

Define:

Conditions or data-generating processes
Real data source or synthetic data generator
Sample sizes, folds, repetitions, seeds, or resamples
Proposed method
Baselines
Ablations
Diagnostics
Metrics
Failure accounting
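
As a rough illustration, the plan items above could be captured in one structure so that the full run grid is enumerable before anything executes. This is a minimal sketch, not the skill's own format: the `ExperimentPlan` class, its field names, the ablation name, and the sample sizes and seeds are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExperimentPlan:
    """Hypothetical container for the plan items; not part of the skill."""
    conditions: tuple[str, ...]       # conditions or data-generating processes
    data_source: str                  # real data source or synthetic generator
    sample_sizes: tuple[int, ...]
    seeds: tuple[int, ...]            # one seed per repetition
    proposed_method: str
    baselines: tuple[str, ...]
    ablations: tuple[str, ...] = ()
    diagnostics: tuple[str, ...] = ()
    metrics: tuple[str, ...] = ()

    def methods(self) -> tuple[str, ...]:
        """Proposed method first, then baselines, then ablations."""
        return (self.proposed_method, *self.baselines, *self.ablations)

    def runs(self) -> list[dict]:
        """Enumerate every (condition, sample size, seed, method) cell.

        Every method is paired with the same conditions, sizes, and seeds.
        """
        grid = product(self.conditions, self.sample_sizes, self.seeds, self.methods())
        return [{"condition": c, "n": n, "seed": s, "method": m} for c, n, s, m in grid]


# Illustrative values only.
plan = ExperimentPlan(
    conditions=("clean", "stress_condition"),
    data_source="synthetic",
    sample_sizes=(100, 1000),
    seeds=(0, 1, 2),
    proposed_method="proposed_method",
    baselines=("standard_method",),
    ablations=("proposed_method_no_component",),
    metrics=("risk",),
)
planned = plan.runs()
print(len(planned))  # 2 conditions * 2 sizes * 3 seeds * 3 methods = 36
```

The planned run count gives failure accounting a denominator: completed plus failed runs should add up to it.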

Required Artifacts

experiments/<TOPIC_ID>/config.yaml
experiments/<TOPIC_ID>/src/
experiments/<TOPIC_ID>/results/metrics.json
experiments/<TOPIC_ID>/results/run_manifest.json
experiments/<TOPIC_ID>/results/comparison_summary.md
experiments/<TOPIC_ID>/results/claim_verdicts.json
experiments/<TOPIC_ID>/report/paper.md
experiments/<TOPIC_ID>/README.md
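
One way to confirm the artifact list before reporting results is a small existence check. The sketch below assumes the paths are relative to a project root and that `src/` is a directory while the other entries are files; the function name `missing_artifacts` is hypothetical.

```python
from pathlib import Path

# Paths relative to experiments/<TOPIC_ID>/, taken from the list above.
REQUIRED_ARTIFACTS = (
    "config.yaml",
    "src",
    "results/metrics.json",
    "results/run_manifest.json",
    "results/comparison_summary.md",
    "results/claim_verdicts.json",
    "report/paper.md",
    "README.md",
)


def missing_artifacts(root, topic_id):
    """Return the required artifacts that are absent, in list order."""
    base = Path(root) / "experiments" / topic_id
    missing = []
    for rel in REQUIRED_ARTIFACTS:
        path = base / rel
        # src/ is expected to be a directory; everything else a regular file.
        present = path.is_dir() if rel == "src" else path.is_file()
        if not present:
            missing.append(rel)
    return missing
```

Calling `missing_artifacts(".", "T01")` from the project root would return an empty list only when all eight artifacts exist for topic `T01` (an illustrative ID).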

Evidence Schema

Use a row-oriented metric format:

{
  "topic_id": "TXX",
  "metric_rows": [
    {
      "claim_id": "C1",
      "method": "proposed_method",
      "baseline": "standard_method",
      "condition": "stress_condition",
      "metric": "risk",
      "value": 0.12,
      "status": "ok"
    }
  ]
}
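
A metrics file in this shape can be checked mechanically before it is used as evidence. The following is a sketch under stated assumptions: the required keys are inferred from the single example row above, and the rule that only rows with status "ok" need a finite numeric value is an assumption, since the source does not list other status values.

```python
import json
import math

# Keys inferred from the example row; the skill does not publish a formal schema.
REQUIRED_ROW_KEYS = {"claim_id", "method", "baseline", "condition", "metric", "value", "status"}


def validate_metrics(doc):
    """Return a list of problems found in a metrics document; empty means valid."""
    problems = []
    if not isinstance(doc.get("topic_id"), str):
        problems.append("topic_id missing or not a string")
    rows = doc.get("metric_rows")
    if not isinstance(rows, list):
        return problems + ["metric_rows missing or not a list"]
    for i, row in enumerate(rows):
        missing = REQUIRED_ROW_KEYS - set(row)
        if missing:
            problems.append(f"row {i}: missing keys {sorted(missing)}")
            continue
        value = row["value"]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        # Assumption: a row marked "ok" must carry a finite number.
        if row["status"] == "ok" and not (is_number and math.isfinite(value)):
            problems.append(f"row {i}: status is ok but value is not a finite number")
    return problems


example = json.loads("""
{
  "topic_id": "TXX",
  "metric_rows": [
    {"claim_id": "C1", "method": "proposed_method", "baseline": "standard_method",
     "condition": "stress_condition", "metric": "risk", "value": 0.12, "status": "ok"}
  ]
}
""")
print(validate_metrics(example))  # []
```

Keeping one row per claim, method, condition, and metric makes the file easy to filter and aggregate without nested lookups.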

Claim verdicts should connect theory and experiments:

[
  {
    "claim_id": "C1",
    "verdict": "supported",
    "theory_support": "Proposition 1 under A1-A3",
    "experimental_support": "Proposed method has lower risk in conditions X-Y",
    "comparison": "Outperforms baseline B on metric M",
    "limitations": "Finite sample only; assumption A2 not tested"
  }
]
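
Because verdicts and metric rows share `claim_id`, the link between them can be checked directly. This sketch assumes both structures have been loaded into Python lists of dicts; the function name `unlinked_claims` and the extra claim IDs in the usage example are hypothetical.

```python
def unlinked_claims(metric_rows, verdicts):
    """Report claim IDs that appear on only one side of the evidence."""
    metric_claims = {row["claim_id"] for row in metric_rows}
    verdict_claims = {verdict["claim_id"] for verdict in verdicts}
    return {
        # A verdict with no metric rows has no experimental evidence behind it.
        "verdicts_without_metrics": sorted(verdict_claims - metric_claims),
        # Metric rows with no verdict were measured but never interpreted.
        "metrics_without_verdicts": sorted(metric_claims - verdict_claims),
    }


rows = [{"claim_id": "C1"}, {"claim_id": "C2"}]
verdicts = [{"claim_id": "C1", "verdict": "supported"}, {"claim_id": "C3", "verdict": "supported"}]
print(unlinked_claims(rows, verdicts))
# {'verdicts_without_metrics': ['C3'], 'metrics_without_verdicts': ['C2']}
```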

Evidence Rules

A metric must map to a formulated claim.
A comparison must use the same data conditions across methods.
Failed runs must be counted.
Runtime reductions must be recorded.
Results must be interpreted against theoretical predictions.
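
Two of these rules lend themselves to an automated check over the metric rows. The sketch below assumes that every method, including baselines and ablations, has its own rows keyed by the `method` field, and that any status other than "ok" marks a failed run; the status label "failed" and the helper names are hypothetical.

```python
from collections import Counter, defaultdict


def conditions_by_method(rows):
    """Map each method to the set of conditions it was evaluated under."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["method"]].add(row["condition"])
    return dict(seen)


def mismatched_methods(rows):
    """Return, per method, the conditions it lacks relative to all other methods."""
    per_method = conditions_by_method(rows)
    all_conditions = set().union(*per_method.values()) if per_method else set()
    return {
        method: sorted(all_conditions - conditions)
        for method, conditions in per_method.items()
        if conditions != all_conditions
    }


def status_counts(rows):
    """Count rows by status so failed runs stay visible in the summary."""
    return Counter(row["status"] for row in rows)


rows = [
    {"method": "proposed_method", "condition": "clean", "status": "ok"},
    {"method": "proposed_method", "condition": "stress_condition", "status": "failed"},
    {"method": "standard_method", "condition": "clean", "status": "ok"},
]
print(mismatched_methods(rows))   # {'standard_method': ['stress_condition']}
print(dict(status_counts(rows)))  # {'ok': 2, 'failed': 1}
```

A non-empty result from `mismatched_methods` means a comparison would mix data conditions across methods, which the second rule forbids.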
