data-analysis

Generate statistical analysis code with 4-round review. Select appropriate statistical tests, interpret results, and produce analysis reports with p-values, effect sizes, and confidence intervals. Use when analyzing experimental data for a paper.

Summary

Installation command

npx skills add https://github.com/lingzhi227/agent-research-skills --skill data-analysis

Copy and paste this command into Claude Code to install the skill.

Source

lingzhi227/agent-research-skills

Stars: 90

Forks: 16

Updated: February 20, 2026, 05:22

SKILL.md

name	data-analysis
description	Generate statistical analysis code with 4-round review. Select appropriate statistical tests, interpret results, and produce analysis reports with p-values, effect sizes, and confidence intervals. Use when analyzing experimental data for a paper.
argument-hint	["data-source"]

Data Analysis

Generate rigorous statistical analysis code with multi-round review.

Input

$0 — Data source (CSV, JSON, pickle, or experiment logs)
$1 — Research goal or hypothesis to test

References

4-round code review prompts: ~/.claude/skills/data-analysis/references/review-prompts.md

Scripts

Statistical summary and comparison

python ~/.claude/skills/data-analysis/scripts/stat_summary.py --input results.csv --compare method --metric accuracy --output summary.json
python ~/.claude/skills/data-analysis/scripts/stat_summary.py --input results.csv --describe

Detects data types, recommends tests, runs comparisons, outputs effect sizes and significance stars. Requires numpy, scipy.
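
As a rough illustration of the kind of comparison described above, the sketch below runs Welch's t-test on two groups and reports Cohen's d as the effect size. It is not the code in stat_summary.py; the function name `cohens_d`, the pooled-standard-deviation formula, and the sample scores are assumptions made for this example.

```python
import numpy as np
from scipy import stats


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Hypothetical accuracy scores for two methods (illustrative values only).
baseline = [0.71, 0.74, 0.69, 0.73, 0.72]
proposed = [0.78, 0.80, 0.77, 0.79, 0.81]

# Welch's t-test does not assume equal variances between the groups.
t_stat, p_value = stats.ttest_ind(proposed, baseline, equal_var=False)
effect = cohens_d(proposed, baseline)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {effect:.2f}")
```

Reporting the effect size next to the p-value shows how large the difference is, not only whether it is statistically detectable.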

Format p-values

python ~/.claude/skills/data-analysis/scripts/format_pvalue.py --values "0.001 0.05 0.23" --format stars
python ~/.claude/skills/data-analysis/scripts/format_pvalue.py --csv results.csv --column pvalue --format latex

Formats p-values with stars, LaTeX notation, or plain text. Stdlib-only.
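
A minimal stdlib-only sketch of p-value formatting in the three styles mentioned above. The function name `format_pvalue`, the star thresholds (0.05, 0.01, 0.001, applied as strict inequalities), and the "ns" label are common conventions assumed for this example; the actual format_pvalue.py may handle boundaries and labels differently.

```python
def format_pvalue(p, style="stars"):
    """Format a p-value as significance stars, LaTeX, or plain text.

    Thresholds and output strings are assumed conventions, not the script's spec.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p-value must lie in [0, 1], got {p}")
    if style == "stars":
        if p < 0.001:
            return "***"
        if p < 0.01:
            return "**"
        if p < 0.05:
            return "*"
        return "ns"
    if style == "latex":
        return "$p < 0.001$" if p < 0.001 else f"$p = {p:.3f}$"
    if style == "plain":
        return "p < 0.001" if p < 0.001 else f"p = {p:.3f}"
    raise ValueError(f"unknown style: {style!r}")


values = "0.001 0.05 0.23"
print([format_pvalue(float(v)) for v in values.split()])
# prints ['**', 'ns', 'ns'] under the strict thresholds assumed here
```

Because the comparisons are strict, a value exactly at a threshold (such as 0.05) falls into the less significant category.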

Workflow

Step 1: Generate Analysis Code

Structure the code with these sections:

# IMPORT — pandas, numpy, scipy, statsmodels, sklearn
# LOAD DATA — Load from original data files
# DATASET PREPARATIONS — Missing values, units, exclusion criteria
# DESCRIPTIVE STATISTICS — Summary tables if needed
# PREPROCESSING — Dummy variables, normalization
# ANALYSIS — Statistical tests per hypothesis
# SAVE ADDITIONAL RESULTS — Extra results to pickle
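
The section layout above could look like the following in a small script. This is a sketch under stated assumptions: the inline CSV stands in for a real data file, and the column names (`method`, `accuracy`), the exclusion rule, and the output file name `additional_results.pkl` are all hypothetical.

```python
# IMPORT
import io
import pickle

import pandas as pd
from scipy import stats

# LOAD DATA
# An inline CSV stands in for the original data file so the sketch is self-contained.
RAW_CSV = """method,accuracy
baseline,0.71
baseline,0.74
baseline,
baseline,0.69
baseline,0.73
proposed,0.78
proposed,0.80
proposed,0.77
proposed,0.79
"""
df = pd.read_csv(io.StringIO(RAW_CSV))

# DATASET PREPARATIONS
# Assumed exclusion criterion: drop runs with no recorded metric.
df = df.dropna(subset=["accuracy"])

# DESCRIPTIVE STATISTICS
summary = df.groupby("method")["accuracy"].agg(["count", "mean", "std"])
print(summary)

# PREPROCESSING
# Nothing to encode or normalize for a two-group comparison of one numeric metric.

# ANALYSIS
baseline = df.loc[df["method"] == "baseline", "accuracy"]
proposed = df.loc[df["method"] == "proposed", "accuracy"]
result = stats.ttest_ind(proposed, baseline, equal_var=False)  # Welch's t-test

# SAVE ADDITIONAL RESULTS
additional_results = {
    "n_rows_after_exclusion": int(len(df)),
    "t_statistic": float(result.statistic),
    "p_value": float(result.pvalue),
}
with open("additional_results.pkl", "wb") as fh:
    pickle.dump(additional_results, fh)
```

Keeping each stage under its own comment header makes it easier for the review rounds to check data handling and test choice separately.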

Step 2: 4-Round Code Review

Round 1 — Code Flaws: Mathematical/statistical errors, wrong calculations, trivial tests
Round 2 — Data Handling: Missing values, units, preprocessing, test choice
Round 3 — Per-Table: Sensible values, measures of uncertainty, missing data
Round 4 — Cross-Table: Completeness, consistency, missing variables

Step 3: Produce Results

Every nominal value must have uncertainty (CI, STD, or p-value)
Statistical tests must be appropriate for the data type
Results must match actual data — never hallucinate
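
One way to attach uncertainty to a reported mean is a t-based confidence interval, sketched below with scipy. The helper name `mean_with_ci` and the sample scores are assumptions made for illustration; a t-interval is only one of several reasonable choices and presumes roughly normal data.

```python
import numpy as np
from scipy import stats


def mean_with_ci(values, confidence=0.95):
    """Return (mean, lower, upper) for a t-based confidence interval of the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sem = stats.sem(values)  # standard error of the mean (ddof=1)
    low, high = stats.t.interval(confidence, len(values) - 1, loc=mean, scale=sem)
    return mean, low, high


# Hypothetical accuracy scores from five runs (illustrative values only).
scores = [0.78, 0.80, 0.77, 0.79, 0.81]
mean, low, high = mean_with_ci(scores)
print(f"accuracy = {mean:.3f} (95% CI [{low:.3f}, {high:.3f}])")
```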

Allowed Packages

pandas, numpy, scipy, statsmodels, sklearn, pickle

Statistical Test Selection

Data Type	Test
Two groups, normal	Independent t-test
Two groups, non-normal	Mann-Whitney U
Paired samples	Paired t-test / Wilcoxon
Multiple groups	ANOVA / Kruskal-Wallis
Categorical	Chi-square / Fisher's exact
Correlation	Pearson / Spearman
Regression	OLS / Logistic / Mixed effects
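
The two-group rows of the table above can be expressed as a small dispatcher over scipy functions. This is a sketch, not part of the skill: the function name `compare_two_groups` is hypothetical, and the `normal` flag is supplied by the caller, since how normality is judged (for example a Shapiro-Wilk test or a Q-Q plot) is left to the analyst.

```python
from scipy import stats


def compare_two_groups(a, b, normal, paired=False):
    """Pick a two-group test from the table and return (test name, statistic, p-value)."""
    if paired:
        if normal:
            name, res = "Paired t-test", stats.ttest_rel(a, b)
        else:
            name, res = "Wilcoxon signed-rank", stats.wilcoxon(a, b)
    elif normal:
        name, res = "Independent t-test", stats.ttest_ind(a, b)
    else:
        name, res = "Mann-Whitney U", stats.mannwhitneyu(a, b, alternative="two-sided")
    return name, float(res.statistic), float(res.pvalue)


# Hypothetical measurements (illustrative values only).
group_a = [1.2, 2.3, 3.1, 4.8, 5.5]
group_b = [2.0, 3.4, 4.9, 6.1, 7.7]

for normal in (True, False):
    name, stat, p = compare_two_groups(group_a, group_b, normal=normal)
    print(f"{name}: statistic = {stat:.3f}, p = {p:.4f}")
```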

Rules

Always report p-values for statistical tests
Account for relevant confounding variables
Use inherent package functionality (e.g., formula = "y ~ a * b" for interactions)
Do not manually implement available statistical functions
Access dataframes using string-based column names, not integer indices
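
The formula rule can be illustrated with the statsmodels formula API, where `a * b` expands to both main effects plus the `a:b` interaction. The data below are synthetic and generated only for this sketch; the column names `y`, `a`, and `b` mirror the formula in the rule and are otherwise arbitrary.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data with a known interaction effect (illustrative only).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"a": rng.normal(size=n), "b": rng.normal(size=n)})
df["y"] = (
    1.0
    + 2.0 * df["a"]
    - 1.0 * df["b"]
    + 0.5 * df["a"] * df["b"]
    + rng.normal(scale=0.1, size=n)
)

# "y ~ a * b" fits Intercept, a, b, and the a:b interaction in one call,
# so no interaction column has to be built by hand.
model = smf.ols(formula="y ~ a * b", data=df).fit()

# Report the estimate with its p-value and confidence interval,
# accessing results by string label rather than integer position.
ci_low, ci_high = model.conf_int().loc["a:b"]
print(f"a:b = {model.params['a:b']:.3f}, p = {model.pvalues['a:b']:.3g}, "
      f"95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```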

Related Skills

Upstream: experiment-code, experiment-design
Downstream: table-generation, figure-generation, backward-traceability
See also: math-reasoning

More from this repository

Same repository

deep-research

lingzhi227/agent-research-skills

Conduct systematic academic literature reviews in 6 phases, producing structured notes, a curated paper database, and a synthesized final report. Output is organized by phase for clarity.

2026-02-27, 90 stars

github-research

lingzhi227/agent-research-skills

Explore and analyze GitHub repositories related to a research topic. Reads deep-research output, discovers repos from multiple sources, deeply analyzes code, and produces integration blueprints.

2026-02-23, 90 stars

citation-management

lingzhi227/agent-research-skills

Manage BibTeX citations for LaTeX papers. Harvest missing citations from a draft using Semantic Scholar, validate cite keys against .bib files, deduplicate entries, and format bibliography. Use when working with references, BibTeX, or citations.

2026-02-20, 90 stars

experiment-code

lingzhi227/agent-research-skills

Write ML experiment code with iterative improvement. Generate training/evaluation pipelines, debug errors, and optimize results through code reflection. Use when implementing experiments for a research paper.

2026-02-20, 90 stars

figure-generation

lingzhi227/agent-research-skills

Generate publication-quality scientific figures using matplotlib/seaborn with a three-phase pipeline (query expansion, code generation with execution, VLM visual feedback). Handles bar charts, line plots, heatmaps, training curves, ablation plots, and more. Use when the user needs figures, plots, or visualizations for a paper.

2026-02-20, 90 stars

idea-generation

lingzhi227/agent-research-skills

Generate novel research ideas with iterative refinement and novelty checking against literature. Score ideas on Interestingness, Feasibility, and Novelty. Use when brainstorming research directions or validating idea novelty.

2026-02-20, 90 stars

Useful for (SOC)

Data Scientists · Computer and Mathematical Occupations · 15-2051 · L4
