bioinformatics

Performs bioinformatics analyses including pathway enrichment, gene ontology analysis, protein-protein interaction networks, multi-omics integration, and biological sequence database querying; trigger when users discuss gene sets, biological pathways, functional annotation, or omics data integration.

Stars: 850

Forks: 98

Updated: March 12, 2026 at 04:53

Source: beita6969/ScienceClaw

SKILL.md

name	bioinformatics
description	Performs bioinformatics analyses including pathway enrichment, gene ontology analysis, protein-protein interaction networks, multi-omics integration, and biological sequence database querying; trigger when users discuss gene sets, biological pathways, functional annotation, or omics data integration.

When to Trigger

Activate this skill when the user mentions:

Pathway analysis, KEGG, Reactome, WikiPathways
Gene Ontology (GO) enrichment, biological process, molecular function
Protein-protein interaction (PPI) networks, STRING, BioGRID
Multi-omics integration (transcriptomics + proteomics + metabolomics)
Gene set enrichment analysis (GSEA), over-representation analysis (ORA)
Sequence databases, UniProt, NCBI, Ensembl queries
Single-cell RNA-seq analysis, clustering, trajectory inference

Step-by-Step Methodology

Data preparation - Standardize gene/protein identifiers (convert to Entrez, Ensembl, or UniProt IDs as needed). Remove duplicates and handle ambiguous mappings. Verify organism and genome build.
Differential analysis - For transcriptomics: DESeq2 or edgeR (raw count data), or limma-voom (counts converted to log-CPM with precision weights). For proteomics: limma with appropriate normalization. Apply multiple testing correction (BH-FDR). Default thresholds are |log2FC| > 1 and padj < 0.05; both are adjustable.
Functional enrichment - Perform GO enrichment (BP, MF, CC) using clusterProfiler, g:Profiler, or DAVID. Run KEGG/Reactome pathway enrichment. Use GSEA for ranked gene lists (no arbitrary cutoff). Report enriched terms with gene ratio, p-value, adjusted p-value, and gene members.
Network analysis - Build PPI networks from STRING (combined score > 0.7, the high-confidence range). Identify hub genes (degree centrality), bottleneck nodes (betweenness centrality), and functional modules (MCODE, Louvain clustering). Overlay expression data on the network.
Multi-omics integration - For paired omics: correlation analysis, canonical correlation (CCA), or MOFA/DIABLO. Map features across omics layers using shared identifiers or known biological connections. Identify convergent pathways.
Single-cell analysis - QC filtering (genes/cell, UMI/cell, mitochondrial %). Normalization (scran, SCTransform). Dimensionality reduction (PCA, UMAP). Clustering (Leiden, Louvain). Cell type annotation (SingleR, scType, marker genes). Trajectory inference (Monocle3, Slingshot).
Visualization - Generate volcano plots, heatmaps (with hierarchical clustering), dot plots (enrichment), network diagrams, UMAP/tSNE plots (single-cell), and circos plots (multi-omics).
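
The functional enrichment step above can be illustrated with a minimal over-representation analysis (ORA) sketch: a one-sided hypergeometric test per term, followed by Benjamini-Hochberg adjustment. This is not how clusterProfiler, g:Profiler, or DAVID are implemented; it is a standard-library illustration only. The function names (`hypergeom_sf`, `bh_adjust`, `ora`) and the dictionary layout of the results are assumptions made for this example.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from a background of M genes,
    n of which belong to the term (one-sided enrichment p-value)."""
    total = comb(M, N)
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def ora(hits, background, gene_sets):
    """Over-representation analysis of `hits` against each gene set.
    Genes outside the background are ignored on both sides."""
    background = set(background)
    hits = set(hits) & background
    rows = []
    for term, members in gene_sets.items():
        members = set(members) & background
        overlap = hits & members
        rows.append({
            "term": term,
            "gene_ratio": f"{len(overlap)}/{len(hits)}",
            "p_value": hypergeom_sf(len(overlap), len(background), len(members), len(hits)),
            "genes": sorted(overlap),
        })
    for row, padj in zip(rows, bh_adjust([r["p_value"] for r in rows])):
        row["padj"] = padj
    return sorted(rows, key=lambda r: r["p_value"])
```

The explicit `background` argument is deliberate: the p-value depends on the size of the background, so the choice of background gene set changes the result. For real analyses, the established enrichment tools listed in this document remain the appropriate choice.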

Key Databases and Tools

Gene Ontology (GO) - Functional annotations
KEGG / Reactome / WikiPathways - Pathway databases
STRING / BioGRID / IntAct - PPI databases
Ensembl / NCBI / UniProt - Sequence and annotation databases
clusterProfiler / g:Profiler / DAVID - Enrichment tools
Seurat / Scanpy - Single-cell analysis frameworks
Cytoscape - Network visualization

Output Format

Enrichment results as tables: term, description, gene ratio, p-value, padj, gene list.
Volcano plots with labeled significant genes and fold-change thresholds.
Network figures with node coloring (expression), size (degree), and module highlighting.
UMAP/tSNE plots with cluster labels and cell type annotations.
Heatmaps with dendrograms and annotation bars.
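
For the network figures, node size is driven by degree, which can be computed directly from a scored edge list. The sketch below assumes edges arrive as `(protein_a, protein_b, score)` tuples with scores on a 0 to 1 scale; the function names and the placeholder identifiers in the usage example are hypothetical and are not taken from STRING or Cytoscape.

```python
from collections import defaultdict

def build_network(edges, min_score=0.7):
    """Undirected adjacency map keeping only edges with score > min_score."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:  # drop low-confidence edges and self-loops
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def degree_table(adjacency):
    """(node, degree) pairs sorted by descending degree, then by name."""
    pairs = ((node, len(neighbors)) for node, neighbors in adjacency.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))

# Placeholder identifiers and scores, for illustration only.
toy_edges = [
    ("GENE_A", "GENE_B", 0.95),
    ("GENE_A", "GENE_C", 0.81),
    ("GENE_A", "GENE_D", 0.72),
    ("GENE_B", "GENE_C", 0.65),  # below the threshold, so it is dropped
    ("GENE_D", "GENE_E", 0.90),
]
hubs = degree_table(build_network(toy_edges))
```

The resulting degree values can then be passed to whichever plotting tool is used as the node-size attribute. Betweenness centrality and module detection need a graph library and are not covered by this sketch.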

Quality Checklist

Gene ID mapping verified (conversion losses reported)
Background gene set appropriate for enrichment analysis
Multiple testing correction applied (BH-FDR or equivalent)
Redundant GO terms handled (semantic similarity, REVIGO)
Network confidence threshold specified and justified
Single-cell QC thresholds documented
Batch effects assessed and corrected if present
Results cross-validated across databases or methods
Biological interpretation grounded in literature
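
The first checklist item, reporting conversion losses during gene ID mapping, can be made concrete with a small helper. This sketch assumes the mapping table is already loaded as a dictionary from source ID to a list of target IDs; the function name `map_ids`, the report fields, and the placeholder IDs are hypothetical.

```python
def map_ids(genes, mapping):
    """Map source IDs to target IDs and report what was lost.

    `mapping` maps a source ID to a list of candidate target IDs.
    IDs with no target are reported as unmapped, and IDs with more
    than one target as ambiguous; neither kind is silently kept.
    """
    unique = list(dict.fromkeys(genes))  # remove duplicates, keep input order
    mapped, unmapped, ambiguous = {}, [], []
    for gene in unique:
        targets = mapping.get(gene, [])
        if not targets:
            unmapped.append(gene)
        elif len(targets) > 1:
            ambiguous.append(gene)
        else:
            mapped[gene] = targets[0]
    report = {
        "input": len(genes),
        "unique": len(unique),
        "mapped": len(mapped),
        "unmapped": unmapped,
        "ambiguous": ambiguous,
    }
    return mapped, report

# Placeholder IDs, for illustration only.
toy_mapping = {"SYM1": ["ID_001"], "SYM2": ["ID_002", "ID_003"], "SYM3": ["ID_004"]}
mapped, report = map_ids(["SYM1", "SYM2", "SYM3", "SYM1", "SYM9"], toy_mapping)
```

Setting ambiguous IDs aside rather than picking the first candidate is a design choice for this example; an analysis may instead resolve them by rule, as long as the rule is documented.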
