pyopenms

Complete mass spectrometry analysis platform. Use for proteomics and metabolomics workflows—feature detection, peptide/protein identification, label-free and isobaric quantification, adduct/accurate-mass annotation, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. For simple spectral comparison and small-molecule library matching use matchms.

Source: K-Dense-AI/scientific-agent-skills (GitHub; 28,402 stars, 2,909 forks)

Updated: June 12, 2026 at 14:04

name	pyopenms
description	Complete mass spectrometry analysis platform. Use for proteomics and metabolomics workflows—feature detection, peptide/protein identification, label-free and isobaric quantification, adduct/accurate-mass annotation, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. For simple spectral comparison and small-molecule library matching use matchms.
license	3 clause BSD license
allowed-tools	Read Write Edit Bash
compatibility	Requires Python 3.9+ and uv. Examples and scripts target pyOpenMS 3.5.0.
metadata	{"version":"2.0","skill-author":"K-Dense Inc."}

PyOpenMS

Overview

PyOpenMS provides Python bindings to the OpenMS library for computational mass spectrometry, enabling analysis of proteomics and metabolomics data. Use it to read/write MS file formats, process raw spectra, detect and quantify features, identify peptides and proteins, and run end-to-end LC-MS/MS pipelines.

This skill ships ready-to-run scripts in scripts/ covering the most common high-level workflows. Prefer running a script over writing new code—each is a parameterized CLI tool that handles loading, processing, and export. Drop into the Python API (and the references/) only when no script fits.

Installation

uv pip install pyopenms

Verify the installation. The bundled binary prints a harmless one-line memory-status notice on import; __version__ still works:

import pyopenms as ms
print(ms.__version__)  # 3.5.0
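
To check the installed version without importing the package (and so without triggering the import notice), a minimal sketch using only the standard library could look like the following. The helper names `installed_version` and `meets_target` are illustrative and not part of pyOpenMS, and the parsing assumes a plain dotted release string such as 3.5.0.

```python
import re
from importlib import metadata


def installed_version(package="pyopenms"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_target(version, target=(3, 5, 0)):
    """Compare a dotted version string against a target tuple.

    Assumption: only the leading digits of each component matter,
    so a suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in version.split(".")[: len(target)]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "3.5" with zeros before comparing.
    parts += [0] * (len(target) - len(parts))
    return tuple(parts) >= tuple(target)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pyopenms is not installed")
    else:
        print(found, "meets the 3.5.0 target:", meets_target(found))
```

This only reads package metadata, so it says nothing about whether the compiled bindings load correctly; the import check above remains the real test.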

Scripts (start here)

Run with python scripts/<name>.py --help for full options. All accept standard MS file formats and write featureXML/consensusXML/CSV/mzTab/PNG as appropriate.

Inspect & convert

Script	What it does
`inspect_ms_data.py`	Summarize any mzML/mzXML/featureXML/consensusXML/idXML (counts, RT/m/z ranges, TIC, metadata); optional per-spectrum CSV.
`convert_format.py`	Convert between mzML/mzXML/MGF with optional MS-level, RT, and intensity filtering.
`process_spectra.py`	Configurable signal-processing chain: smoothing (Gauss/SGolay), centroiding (PeakPickerHiRes), normalization, S/N and intensity thresholds.

Feature detection & quantification

Script	What it does
`detect_features_metabo.py`	Untargeted metabolomics feature finding: MassTraceDetection → ElutionPeakDetection → FeatureFindingMetabo.
`detect_features_centroided.py`	Peptide/centroided feature detection via FeatureFinderAlgorithmPicked.
`align_link_quantify.py`	Multi-sample pipeline: detect (or load) features → RT alignment → consensus linking → quant matrix CSV.
`consensus_to_matrix.py`	consensusXML → wide intensity matrix + metadata, with optional median/quantile normalization and long format.

Annotation

Script	What it does
`detect_adducts.py`	Group adducts/charge variants of the same neutral mass (MetaboliteFeatureDeconvolution).
`accurate_mass_search.py`	Annotate features against HMDB by accurate mass (AccurateMassSearchEngine → mzTab/CSV).
`export_gnps_sirius.py`	Export GNPS FBMN inputs (MGF + quant table) or a SIRIUS `.ms` file.

Identification

Script	What it does
`process_identifications.py`	Re-index against FASTA, estimate FDR/q-values, filter (FDR/length/best-per-spectrum), export idXML + CSV.
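
As background for the FDR/q-value step, here is a minimal pure-Python sketch of target-decoy q-value estimation. It is not the script's implementation: it assumes higher scores are better, uses the simple decoys/targets estimate, and the function name `q_values` is hypothetical. OpenMS may use a different formula or score orientation.

```python
def q_values(hits):
    """Estimate q-values for a list of (score, is_decoy) hits.

    Assumption: a higher score means a better match.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    # FDR at each score threshold: cumulative decoys / cumulative targets.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / targets if targets else 1.0)

    # q-value: the smallest FDR at this threshold or any more permissive one.
    qs = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qs)]


# Toy data, invented for illustration only.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
accepted = [s for s, d, q in q_values(example) if not d and q <= 0.01]
```

With the toy data, only the two top-scoring target hits pass a 1% threshold, because the first decoy appears directly below them.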

Chemistry

Script	What it does
`mass_calculator.py`	Monoisotopic/average mass, charged m/z, formula, and isotope pattern for peptides or empirical formulas.
`digest_protein.py`	In-silico protease digestion of FASTA/sequence → theoretical peptides with masses and m/z.
`theoretical_spectrum.py`	Generate annotated theoretical fragment spectra (b/y/a/c/x/z, losses) for a peptide.
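
The quantities these chemistry scripts report can be illustrated with a small standalone sketch. It is independent of how the scripts are implemented and handles unmodified peptides only. It uses the common tryptic rule (cleave after K or R unless followed by P) and the relation m/z = (M + z × proton mass) / z; the function names are hypothetical.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the free N- and C-termini
PROTON = 1.007276


def digest_trypsin(sequence, missed=0):
    """Split after K/R (not before P), allowing up to `missed` missed cleavages."""
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for start in range(len(pieces)):
        for extra in range(missed + 1):
            end = start + extra + 1
            if end > len(pieces):
                break
            peptides.append("".join(pieces[start:end]))
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mz(peptide, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge


if __name__ == "__main__":
    for pep in digest_trypsin("AKRPEKGG", missed=1):
        print(pep, round(monoisotopic_mass(pep), 4), round(mz(pep, 2), 4))
```

For modified peptides, other enzymes, or average masses, the bundled scripts and the pyOpenMS chemistry classes are the better route.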

Targeted & visualization

Script	What it does
`extract_chromatograms.py`	Build TIC/BPC and XIC traces for target m/z (CSV + optional plot).
`plot_ms_data.py`	Quick plots: single spectrum, TIC, 2D feature map, MS1 signal map.
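
For readers unfamiliar with the trace types named above, the following sketch shows what TIC, BPC, and XIC mean computationally. The input layout (a list of `(rt, [(mz, intensity), ...])` tuples), the ppm window, and the function names are assumptions made for illustration; this is not the code inside extract_chromatograms.py.

```python
def tic(spectra):
    """Total ion chromatogram: summed intensity per spectrum."""
    return [(rt, sum(i for _mz, i in peaks)) for rt, peaks in spectra]


def bpc(spectra):
    """Base peak chromatogram: most intense peak per spectrum."""
    return [(rt, max((i for _mz, i in peaks), default=0.0)) for rt, peaks in spectra]


def xic(spectra, target_mz, tol_ppm=10.0):
    """Extracted ion chromatogram: summed intensity within a ppm window."""
    half_width = target_mz * tol_ppm / 1e6
    lo, hi = target_mz - half_width, target_mz + half_width
    trace = []
    for rt, peaks in spectra:
        total = sum(i for mz, i in peaks if lo <= mz <= hi)
        trace.append((rt, total))
    return trace


# Toy spectra, invented for illustration: (retention time, peak list).
spectra = [
    (1.0, [(100.0, 5.0), (200.0005, 10.0)]),
    (2.0, [(200.0, 20.0), (200.01, 7.0)]),
    (3.0, []),
]
trace = xic(spectra, target_mz=200.0, tol_ppm=10.0)
```

With pyOpenMS, the peak lists would typically come from each spectrum's `get_peaks()` arrays rather than hand-written tuples.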

Common script recipes

# Inspect a file
python scripts/inspect_ms_data.py sample.mzML --spectra-csv spectra.csv

# Untargeted metabolomics: features for one sample
python scripts/detect_features_metabo.py sample.mzML --out-csv features.csv

# Full multi-sample quantification study
python scripts/align_link_quantify.py s1.mzML s2.mzML s3.mzML --out-prefix study
python scripts/consensus_to_matrix.py study.consensusXML --out quant.csv --normalize median

# Peptide chemistry
python scripts/mass_calculator.py --peptide "PEPTIDEM(Oxidation)K" --charges 1 2 3 --isotopes 5
python scripts/digest_protein.py proteins.fasta --enzyme Trypsin --missed 2 --out peptides.csv

# Identification post-processing
python scripts/process_identifications.py search.idXML --fasta db.fasta --fdr 0.01 --out filtered.idXML --csv hits.csv
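
When the recipes need to be driven from Python rather than a shell, one option is to build the argument lists and pass them to `subprocess`. The sketch below mirrors the multi-sample recipe above and uses only the flags shown there. The helper names are hypothetical, and it assumes, as the recipe implies, that `--out-prefix study` produces `study.consensusXML`.

```python
import subprocess
import sys
from pathlib import Path


def build_study_commands(samples, prefix="study", out_csv="quant.csv",
                         normalize="median", scripts_dir="scripts"):
    """Return argv lists for alignment/linking and matrix export."""
    scripts = Path(scripts_dir)
    align = [
        sys.executable, str(scripts / "align_link_quantify.py"),
        *samples, "--out-prefix", prefix,
    ]
    matrix = [
        sys.executable, str(scripts / "consensus_to_matrix.py"),
        f"{prefix}.consensusXML", "--out", out_csv, "--normalize", normalize,
    ]
    return [align, matrix]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = build_study_commands(["s1.mzML", "s2.mzML", "s3.mzML"])
# run_all(commands)  # uncomment once the scripts and input files are in place
```

Passing argument lists instead of a shell string avoids quoting problems with file names that contain spaces.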

Key 3.5.0 API notes

These changed from older OpenMS releases—older tutorials and code will break:

Feature finding: FeatureFinder("centroided") was removed. Use FeatureFinderAlgorithmPicked (proteomics/centroided) or the MassTraceDetection → ElutionPeakDetection → FeatureFindingMetabo pipeline (metabolomics). See detect_features_*.py.
idXML I/O: IdXMLFile().load/store require an ms.PeptideIdentificationList() for peptide IDs (a plain Python list raises "can not handle type"). Protein IDs remain a plain list.
Adduct decharging: the class is MetaboliteFeatureDeconvolution, and adducts use Elements:Charge:Probability syntax (e.g. H:+:0.4, H-2O-1:0:0.05)—not bracket notation like [M+H]+.
DataFrame columns: FeatureMap.get_df() uses lowercase rt/mz (not RT). ConsensusMap provides get_intensity_df() and get_metadata_df().
Bundled data caveat: the pip wheel ships HMDBMappingFile.tsv but not HMDB2StructMapping.tsv; accurate_mass_search.py detects this and explains how to supply it.
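
To make the Elements:Charge:Probability adduct syntax above concrete, here is a small illustrative parser. It is not the OpenMS parser. It assumes the charge field is either `0` or a run of `+` or `-` signs whose count gives the charge magnitude, and that any fields after the third can be ignored; the names `Adduct` and `parse_adduct` are hypothetical.

```python
from typing import NamedTuple


class Adduct(NamedTuple):
    elements: str       # e.g. "H" or "H-2O-1" (loss of water)
    charge: int         # signed charge contributed by the adduct
    probability: float


def parse_adduct(spec):
    """Parse an 'Elements:Charge:Probability' string into an Adduct."""
    fields = spec.split(":")
    if len(fields) < 3:
        raise ValueError(f"expected Elements:Charge:Probability, got {spec!r}")
    elements, charge_field, prob_field = fields[:3]

    if charge_field == "0":
        charge = 0
    elif set(charge_field) == {"+"}:
        charge = len(charge_field)
    elif set(charge_field) == {"-"}:
        charge = -len(charge_field)
    else:
        raise ValueError(f"unrecognized charge field {charge_field!r}")

    probability = float(prob_field)
    # Assumption: probabilities are expected to lie in [0, 1].
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    return Adduct(elements, charge, probability)


protonated = parse_adduct("H:+:0.4")
water_loss = parse_adduct("H-2O-1:0:0.05")
```

Bracket notation such as `[M+H]+` contains no colons, so this sketch rejects it with a `ValueError`, which makes it usable as a quick sanity check before handing strings to the algorithm.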

Core data structures

MSExperiment – collection of spectra and chromatograms
MSSpectrum / MSChromatogram – a single spectrum / chromatographic trace
Feature / FeatureMap – a detected LC-MS peak / collection of features
ConsensusMap – features linked across samples (the quant table)
PeptideIdentification / ProteinIdentification – search results
AASequence / EmpiricalFormula – sequence and formula chemistry

For details, see references/data_structures.md.

Parameter management

Most algorithms expose an OpenMS Param object:

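import pyopenms as ms

algo = ms.FeatureFindingMetabo()
p = algo.getDefaults()
# List every parameter with its current value and description
for key in p.keys():
    print(key.decode(), "=", p.getValue(key), "|", p.getDescription(key))
# Override one default, then hand the parameters back to the algorithm
p.setValue("charge_lower_bound", 1)
algo.setParameters(p)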

Export to pandas

import pyopenms as ms

fm = ms.FeatureMap()
ms.FeatureXMLFile().load("features.featureXML", fm)
df = fm.get_df()             # columns include lowercase rt, mz, intensity, charge, quality

cm = ms.ConsensusMap()
ms.ConsensusXMLFile().load("study.consensusXML", cm)
intensities = cm.get_intensity_df()   # features x samples
metadata = cm.get_metadata_df()       # rt, mz, charge, quality, ...
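
Once a features x samples intensity matrix is available, median normalization can be sketched as below. This is one common definition (scale each sample so its median matches the median of all sample medians) and is not necessarily what `consensus_to_matrix.py --normalize median` does. It works on plain nested lists, treats zero or None as missing, and the function name is hypothetical.

```python
from statistics import median


def median_normalize(matrix):
    """Scale each sample (column) so that all sample medians are equal.

    matrix: list of feature rows, each a list of per-sample intensities.
    Zero or None entries are treated as missing and left unchanged.
    """
    if not matrix:
        return []
    n_samples = len(matrix[0])

    # Median of the observed (non-missing) intensities in each sample.
    sample_medians = []
    for j in range(n_samples):
        observed = [row[j] for row in matrix if row[j]]
        sample_medians.append(median(observed) if observed else None)

    valid = [m for m in sample_medians if m]
    if not valid:
        return [list(row) for row in matrix]

    # Common target: the median of the per-sample medians.
    target = median(valid)
    factors = [target / m if m else 1.0 for m in sample_medians]
    return [[v * f if v else v for v, f in zip(row, factors)] for row in matrix]


# Toy matrix, invented for illustration: 3 features x 2 samples.
raw = [[100.0, 200.0], [200.0, 400.0], [300.0, 600.0]]
normalized = median_normalize(raw)
```

A pandas DataFrame can be fed in through `df.values.tolist()`, although for real studies a vectorized pandas or NumPy version would be more natural.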

Integration with other tools

pyOpenMS output works with pandas (DataFrames), NumPy (peak arrays), scikit-learn (ML), and Matplotlib/Seaborn (plots). Downstream tools and formats are reached via export: GNPS (FBMN), SIRIUS, and mzTab.

Resources

Official docs (3.5.0): https://pyopenms.readthedocs.io/en/release-3.5.0/
OpenMS: https://www.openms.org
GitHub: https://github.com/OpenMS/OpenMS

References

references/file_io.md – file format handling
references/signal_processing.md – signal processing algorithms
references/feature_detection.md – feature detection and linking
references/identification.md – peptide and protein identification
references/metabolomics.md – metabolomics-specific workflows
references/data_structures.md – core objects and data structures