Execute qualquer Skill no Manus
com um clique

Execute qualquer Skill no Manus com um clique

$pwd:

database-access-skills-index

Name: Database Access Skills Index
Author: aristoteleo

// Skills for querying and downloading data from genomic, transcriptomic, 3D-genome, and cancer-genomics databases. Covers programmatic access to public repositories, gene annotation, sequence retrieval, processed functional-genomics tracks, Hi-C / Micro-C contact matrices, TCGA-style cohorts, and large-scale single-cell data.

Executar no Manus

$ git log --oneline --stat

stars:436

forks:57

updated:20 de maio de 2026 às 07:30

Explorador de arquivos

8 arquivos

SKILL.md

readonly

id	database_access_index
name	Database Access Skills Index
description	Skills for querying and downloading data from genomic, transcriptomic, 3D-genome, and cancer-genomics databases. Covers programmatic access to public repositories, gene annotation, sequence retrieval, processed functional-genomics tracks, Hi-C / Micro-C contact matrices, TCGA-style cohorts, and large-scale single-cell data.

Database Access Skills

Tools and workflows for accessing public biological databases, retrieving sequencing data, querying gene/protein information, pulling processed functional / 3D-genome tracks, fetching cancer-cohort data, and downloading large-scale single-cell datasets.

Quick map — which skill for what

If you need...	Use
Gene / protein / variant metadata (Ensembl, UniProt, NCBI, …)	gget
Raw sequencing reads (FASTQ) from SRA/ENA/GEO/DDBJ/GSA	iSeq
Large-scale single-cell RNA-seq matrices	CELLxGENE Census
Processed functional-genomics tracks (ChIP/ATAC/DNase/RNA-seq bigWig/BAM/peaks)	ENCODE
3D-genome contact matrices (Hi-C / Micro-C / ChIA-PET .mcool / .hic)	4DN
liftOver, UCSC tracks, sequence pulls, large genome catalog	UCSC
Cancer-cohort RNA-seq counts, MAF mutations, CNV, methylation (TCGA / CPTAC)	GDC

The new four (ENCODE / 4DN / UCSC / GDC) are mostly orthogonal to the existing three — they cover processed tracks, 3D genome data, coordinate utilities, and cancer cohorts that gget / iSeq / Census don't reach.

Available Skills

gget — Genomic Database Querying

Python package and CLI tool with 23 interoperable modules for efficiently querying genomic databases including Ensembl, NCBI, UniProt, ARCHS4, Enrichr, COSMIC, OpenTargets, CellxGene, cBioPortal, PDB, and Bgee.

Skill file: gget.md

When to use:

Fetching reference genome/annotation download links (Ensembl)
Searching genes by keyword or retrieving gene metadata
Running BLAST/DIAMOND sequence alignment
Performing enrichment analysis (GO, KEGG, pathway)
Querying cancer mutations (COSMIC) or drug-target associations (OpenTargets)
Retrieving single-cell data from CZ CELLxGENE Discover
Looking up protein structures (PDB, AlphaFold)
Finding tissue expression patterns (ARCHS4, Bgee)
Plotting cancer genomics heatmaps (cBioPortal)

iSeq — Sequencing Data Download

Bash CLI tool for downloading sequencing data and metadata from five public databases (GSA, SRA, ENA, DDBJ, GEO) through a single unified interface. Supports parallel downloads, Aspera transfers, and automatic format conversion.

Skill file: iseq.md

When to use:

Downloading raw sequencing data (FASTQ/SRA) from public repositories
Fetching metadata for projects, experiments, or runs
Downloading from Chinese GSA database (CRA/CRR accessions)
Batch downloading multiple accessions from a file
Converting SRA files to FASTQ format
Merging FASTQ files by experiment, sample, or study

CZ CELLxGENE Census — Single-Cell RNA-seq Data Access

Cloud-based Python API for accessing 217M+ single-cell RNA-seq observations from CZ CELLxGENE Discover via TileDB-SOMA. Supports flexible metadata queries, gene filtering, and pre-computed embeddings (scVI, Geneformer).

Skill file: cellxgene_census.md

When to use:

Querying large-scale single-cell RNA-seq data by tissue, cell type, disease
Downloading count matrices as AnnData objects with metadata filters
Accessing pre-computed embeddings (scVI, Geneformer)
Finding which datasets contain specific genes or cell types
Working with larger-than-memory single-cell datasets via streaming
Exploring CZ CELLxGENE Discover catalog programmatically

ENCODE — Functional Genomics Tracks (ChIP/ATAC/DNase/RNA-seq)

Query and download from the ENCODE Portal — standardised processed files (BAM, bigWig, narrowPeak) for ChIP-seq, ATAC-seq, DNase-seq, RNA-seq, eCLIP, etc. across human / mouse cell types and tissues. Pairs with the igv and gosling LiveViews.

Skill file: encode.md

When to use:

"Find CTCF ChIP-seq peaks in K562 from ENCODE" — TF / histone / chromatin data by target + biosample
ENCODE bigWig signal → IGV track for a paper-quality view
Cross-cell-line / cross-tissue comparison of a regulatory assay
Anywhere you need ENCODE's processed outputs, not raw FASTQ

4DN — 3D Genome Data (Hi-C, Micro-C, ChIA-PET)

Query and download from the 4D Nucleome Data Portal — Hi-C variants, Micro-C, ChIA-PET, SPRITE, GAM, FISH. Returns .mcool / .hic contact matrices and .pairs.gz files. Pairs with the gosling LiveView via the HiGlass back-end, and with cooler / pairix for Python analysis.

Skill file: fourdn.md

When to use:

3D-genome contact map for a cell line / condition
.mcool for use with Cooler / HiGlass / Gosling
ChIA-PET loop calls for a TF
Comparative Hi-C across replicates / conditions

UCSC — Genome Browser API + liftOver

Programmatic access to UCSC's REST API (assemblies, tracks, sequence, gene-symbol search, Track Hubs) and the canonical liftOver coordinate- conversion utility. Use UCSC where Ensembl / NCBI fall short: cross- assembly liftOver, public Track Hubs, the large UCSC assembly catalog (panTro6, calJac4, etc.).

Skill file: ucsc.md

When to use:

liftOver a BED / VCF / single coordinate between assemblies
Find a public Track Hub for a non-model organism
One-off sequence / track pulls without writing pysam boilerplate
Browse UCSC's assembly catalog (assemblies Ensembl doesn't have)

GDC — NCI Genomic Data Commons (TCGA + friends)

Query and download from the NCI GDC — TCGA, CPTAC, TARGET, FM-AD cancer cohorts. Open-tier files (RNA-seq counts, MAF mutations, CNV, methylation, clinical) need no auth; controlled-tier (BAM, gVCF) needs a dbGaP / NCI token. Complements gget cbio (processed cBioPortal view) with the raw GDC files.

Skill file: gdc.md

When to use:

TCGA RNA-seq counts for a project — differential expression input
MAF mutation file for an entire TCGA cohort
CNV / methylation / clinical metadata from TCGA / CPTAC / TARGET
Token-required BAM / gVCF download

Using Skills

Identify your goal: use the quick-map table above to pick the right portal (metadata vs raw reads vs processed tracks vs 3D-genome vs cancer cohort vs single-cell vs coordinate utilities).
Load skill file: Read the full skill document for detailed guidance on its API, filter patterns, and viewer wiring.
Follow examples: each skill's recipes section is copy-paste-able.
Combine tools: they're orthogonal by design — e.g. gget to look up a gene's coordinates → ENCODE for ChIP-seq tracks at that locus → IGV LiveView to display; or GDC for TCGA RNA-seq counts → gget for gene-set enrichment on the result.
Viewer wiring: every new skill includes a "Wire it to a viewer" section showing how to pipe its files into the igv / gosling / molstar LiveViews (download → serve_local_data → viewer track).

related-skills.json

mesmo repositório

rare-disease-skills-index.md

from "aristoteleo/PantheonOS"

Skills for rare disease case support: ontology-first normalization/retrieval and the clinical genetics consult report format contract (structure + theme). Load the relevant skill file when performing the matching task.

2026-05-31436

paper-writing-scenario-router.md

from "aristoteleo/PantheonOS"

Scenario router for paper-writing tasks. Use after root triage to select paper submission, journal article, conference paper, grant proposal, lab report, group report, talk/workshop, or revision-response behavior.

2026-05-29436

paper-writing-skills-index.md

from "aristoteleo/PantheonOS"

Routing and workflow skill family for paper-writing tasks. Covers manuscript drafting, journal and conference papers, grant proposals, lab reports, group-meeting reports, talks, workshop notes, reviewer rebuttals, academic HTML/PDF/LaTeX output with editable-block contracts, citation grounding, evidence checking, and pre-submission quality gates.

2026-05-29436

paper-writing-workflow.md

from "aristoteleo/PantheonOS"

Workflow phases for paper-writing tasks: triage, material inventory, research question, literature review, paper outline, data analysis summary, figure storyline, reader testing, and finalize packet. Each phase is a short contract — read the relevant rows for the current task only.

2026-05-29436

evidence-bound-section-drafting.md

from "aristoteleo/PantheonOS"

Section-level writing skill for evidence-bound academic prose: IMRaD papers, grants, reports, talks, and response letters. Indexes section-specific templates (abstract, introduction, method, results, discussion) and pre-submission quality protocols (claim-evidence, reviewer rubric, response letter).

2026-05-29436

liveview-skills-index.md

from "aristoteleo/PantheonOS"

Skills for opening and driving agent-controllable visualization components in the Pantheon UI sidebar — interactive viewers the agent can open, control, and read back. Viewers: Vitessce (spatial / single- cell omics), Viv (bioimage / microscopy), plus agent-generated apps.

2026-05-20436

package.json

"author": "aristoteleo"

"repository": "aristoteleo/PantheonOS"

Abrir repositório GitHub Ver repositórios do creator

$ install --global

$ download --local

Executar no Manus

$ useful --forSOC

Cientistas de dadosInformática e Matemática15-2051L4

id	database_access_index
name	Database Access Skills Index
description	Skills for querying and downloading data from genomic, transcriptomic, 3D-genome, and cancer-genomics databases. Covers programmatic access to public repositories, gene annotation, sequence retrieval, processed functional-genomics tracks, Hi-C / Micro-C contact matrices, TCGA-style cohorts, and large-scale single-cell data.

Database Access Skills

Quick map — which skill for what

If you need...	Use
Gene / protein / variant metadata (Ensembl, UniProt, NCBI, …)	gget
Raw sequencing reads (FASTQ) from SRA/ENA/GEO/DDBJ/GSA	iSeq
Large-scale single-cell RNA-seq matrices	CELLxGENE Census
Processed functional-genomics tracks (ChIP/ATAC/DNase/RNA-seq bigWig/BAM/peaks)	ENCODE
3D-genome contact matrices (Hi-C / Micro-C / ChIA-PET .mcool / .hic)	4DN
liftOver, UCSC tracks, sequence pulls, large genome catalog	UCSC
Cancer-cohort RNA-seq counts, MAF mutations, CNV, methylation (TCGA / CPTAC)	GDC

Available Skills

gget — Genomic Database Querying

Skill file: gget.md

When to use:

Fetching reference genome/annotation download links (Ensembl)
Searching genes by keyword or retrieving gene metadata
Running BLAST/DIAMOND sequence alignment
Performing enrichment analysis (GO, KEGG, pathway)
Querying cancer mutations (COSMIC) or drug-target associations (OpenTargets)
Retrieving single-cell data from CZ CELLxGENE Discover
Looking up protein structures (PDB, AlphaFold)
Finding tissue expression patterns (ARCHS4, Bgee)
Plotting cancer genomics heatmaps (cBioPortal)

iSeq — Sequencing Data Download

Skill file: iseq.md

When to use:

Downloading raw sequencing data (FASTQ/SRA) from public repositories
Fetching metadata for projects, experiments, or runs
Downloading from Chinese GSA database (CRA/CRR accessions)
Batch downloading multiple accessions from a file
Converting SRA files to FASTQ format
Merging FASTQ files by experiment, sample, or study

CZ CELLxGENE Census — Single-Cell RNA-seq Data Access

Skill file: cellxgene_census.md

When to use:

Querying large-scale single-cell RNA-seq data by tissue, cell type, disease
Downloading count matrices as AnnData objects with metadata filters
Accessing pre-computed embeddings (scVI, Geneformer)
Finding which datasets contain specific genes or cell types
Working with larger-than-memory single-cell datasets via streaming
Exploring CZ CELLxGENE Discover catalog programmatically

ENCODE — Functional Genomics Tracks (ChIP/ATAC/DNase/RNA-seq)

Skill file: encode.md

When to use:

"Find CTCF ChIP-seq peaks in K562 from ENCODE" — TF / histone / chromatin data by target + biosample
ENCODE bigWig signal → IGV track for a paper-quality view
Cross-cell-line / cross-tissue comparison of a regulatory assay
Anywhere you need ENCODE's processed outputs, not raw FASTQ

4DN — 3D Genome Data (Hi-C, Micro-C, ChIA-PET)

Skill file: fourdn.md

When to use:

3D-genome contact map for a cell line / condition
.mcool for use with Cooler / HiGlass / Gosling
ChIA-PET loop calls for a TF
Comparative Hi-C across replicates / conditions

UCSC — Genome Browser API + liftOver

Skill file: ucsc.md

When to use:

liftOver a BED / VCF / single coordinate between assemblies
Find a public Track Hub for a non-model organism
One-off sequence / track pulls without writing pysam boilerplate
Browse UCSC's assembly catalog (assemblies Ensembl doesn't have)

GDC — NCI Genomic Data Commons (TCGA + friends)

Skill file: gdc.md

When to use:

TCGA RNA-seq counts for a project — differential expression input
MAF mutation file for an entire TCGA cohort
CNV / methylation / clinical metadata from TCGA / CPTAC / TARGET
Token-required BAM / gVCF download

Using Skills

Identify your goal: use the quick-map table above to pick the right portal (metadata vs raw reads vs processed tracks vs 3D-genome vs cancer cohort vs single-cell vs coordinate utilities).
Load skill file: Read the full skill document for detailed guidance on its API, filter patterns, and viewer wiring.
Follow examples: each skill's recipes section is copy-paste-able.
Combine tools: they're orthogonal by design — e.g. gget to look up a gene's coordinates → ENCODE for ChIP-seq tracks at that locus → IGV LiveView to display; or GDC for TCGA RNA-seq counts → gget for gene-set enrichment on the result.
Viewer wiring: every new skill includes a "Wire it to a viewer" section showing how to pipe its files into the igv / gosling / molstar LiveViews (download → serve_local_data → viewer track).

database-access-skills-index

Database Access Skills

Quick map — which skill for what

Available Skills

gget — Genomic Database Querying

iSeq — Sequencing Data Download

CZ CELLxGENE Census — Single-Cell RNA-seq Data Access

ENCODE — Functional Genomics Tracks (ChIP/ATAC/DNase/RNA-seq)

4DN — 3D Genome Data (Hi-C, Micro-C, ChIA-PET)

UCSC — Genome Browser API + liftOver

GDC — NCI Genomic Data Commons (TCGA + friends)

Using Skills

Mais deste repositório

Mais deste repositório

Database Access Skills

Quick map — which skill for what

Available Skills

gget — Genomic Database Querying

iSeq — Sequencing Data Download

CZ CELLxGENE Census — Single-Cell RNA-seq Data Access

ENCODE — Functional Genomics Tracks (ChIP/ATAC/DNase/RNA-seq)

4DN — 3D Genome Data (Hi-C, Micro-C, ChIA-PET)

UCSC — Genome Browser API + liftOver

GDC — NCI Genomic Data Commons (TCGA + friends)

Using Skills