
Name: Encode Ccres Database
Author: google-deepmind

Query the ENCODE Registry of cis-Regulatory Elements (cCREs) via the SCREEN GraphQL API, or make custom queries to the ENCODE Portal REST API for experiments and files (ChIP-seq peaks, etc.). Use when you want to query regulatory annotations or raw experimental data across human cell types.


stars: 527

forks: 57

updated: 28 May 2026 at 16:35


name	encode-ccres-database
description	Query the ENCODE Registry of cis-Regulatory Elements (cCREs) via the SCREEN GraphQL API, or make custom queries to the ENCODE Portal REST API for experiments and files (ChIP-seq peaks, etc.). Use when you want to query regulatory annotations or raw experimental data across human cell types.

ENCODE Database Skill

This skill lets you query the ENCODE Registry of cCREs (candidate cis-Regulatory Elements) via the SCREEN GraphQL API. It helps identify functional non-coding DNA elements (such as promoters, enhancers, and insulators) by analyzing biochemical signatures (DNase, H3K4me3, H3K27ac, CTCF).

Prerequisites

uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://www.encodeproject.org/help/rest-api/, and (2) create the file, recording the notification text and a timestamp.
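The notification step can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: the function name `ensure_license_notification`, the wording of the notice, and the timestamp format are all assumptions; only the file name and the terms URL come from the text above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

TERMS_URL = "https://www.encodeproject.org/help/rest-api/"


def ensure_license_notification(skill_dir: Path) -> Optional[str]:
    """Return the notice to show on first use, or None if already recorded.

    Hypothetical helper: the notice wording and file layout are assumptions.
    """
    marker = skill_dir / "LICENSE_NOTIFICATION.txt"
    if marker.exists():
        return None
    notice = f"NOTICE: please check the ENCODE terms of use at {TERMS_URL}"
    timestamp = datetime.now(timezone.utc).isoformat()
    # Record both the notification text and when it was shown.
    marker.write_text(f"{notice}\nNotified at: {timestamp}\n", encoding="utf-8")
    return notice
```

The caller is expected to display the returned notice prominently; a return value of `None` means the user was already notified on an earlier run.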

Core Rules

Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce the required rate limit.
Parsing Output: Do NOT use cat to read the entire JSON output file into context, as it can be extremely large. You MUST use jq to efficiently parse and extract relevant fields.
Notification: If this skill is used, ensure this is mentioned in the output.

Quick Start

```
# Search cCREs by coordinates
uv run scripts/screen_api.py search --chromosome chr11 \
  --start 5205263 --end 5207263 \
  --output /tmp/search.json

# Get details for a specific cCRE
uv run scripts/screen_api.py details EH38E2941922 \
  --output /tmp/details.json
```

All subcommands write JSON to disk. Always save output in a temporary location like /tmp/.

Identifying High-Confidence ("Type A") Biosamples

Biosamples in ENCODE are often categorized by their data completeness. "Type A" (or high-confidence) biosamples are those that have experimental data for all four core epigenetic markers: DNase, H3K4me3, H3K27ac, and CTCF.
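The Type A definition amounts to a subset check. The sketch below only illustrates that rule; the helper scripts already compute `is_type_a` for you, and representing a biosample's assays as a set of strings is an assumption made for this example.

```python
# The four core assays named in the Type A definition above.
CORE_ASSAYS = frozenset({"DNase", "H3K4me3", "H3K27ac", "CTCF"})


def is_type_a(available_assays) -> bool:
    """True when every core assay is present (hypothetical input: a set of assay names)."""
    return CORE_ASSAYS.issubset(available_assays)


print(is_type_a({"DNase", "H3K4me3", "H3K27ac", "CTCF"}))  # all four present
print(is_type_a({"DNase", "CTCF"}))                        # two assays missing
```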

The biosamples and details commands automatically enrich their output with an is_type_a boolean flag for each biosample.

Example: Finding high-confidence cell types

```
uv run scripts/screen_api.py biosamples --output /tmp/biosamples.json
# Use jq to filter for Type A biosamples
jq '.data.ccREBiosampleQuery.biosamples[] | select(.is_type_a == true) | .displayname' /tmp/biosamples.json
```

Parsing Output (CRITICAL)

Do NOT use cat to read the entire JSON output file into context, as it can be extremely large. Instead, you MUST use jq to efficiently parse and extract the relevant fields from the JSON file saved by the script. If jq is not available on the system, write your own Python filtering code (e.g., python3 -c "import json...") to extract the necessary data.
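If jq is unavailable, the Type A filter shown earlier could be written in Python roughly as below. This is a sketch: the function name `type_a_display_names` is hypothetical, and the JSON path simply mirrors the jq expression used in this document, so check references/json_output_structure.md before relying on it for other commands.

```python
import json


def type_a_display_names(json_path):
    """Return display names of Type A biosamples from a saved biosamples file."""
    with open(json_path, encoding="utf-8") as handle:
        payload = json.load(handle)
    # Same path as the jq filter: .data.ccREBiosampleQuery.biosamples[]
    biosamples = payload["data"]["ccREBiosampleQuery"]["biosamples"]
    return [b["displayname"] for b in biosamples if b.get("is_type_a") is True]
```

Only the extracted names are returned, so the full JSON never needs to be printed into context. For example, `type_a_display_names("/tmp/biosamples.json")` would read the file written by the biosamples command.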

For a complete reference of the JSON structure returned by each command (so you know which fields to query with jq), read references/json_output_structure.md.

Available Commands

search: Search cCREs by coordinates, accessions, or epigenetic signals.

```
uv run scripts/screen_api.py search \
    --chromosome chr11 --start 5205263 --end 5207263 \
    --output /tmp/search.json
```

nearby-genes: Find nearby genes for given cCRE accessions.

```
uv run scripts/screen_api.py nearby-genes \
    EH38E1516972 --output /tmp/nearby.json
```

details: Get detailed information and biosample-specific max Z-scores for a specific cCRE.
```
uv run scripts/screen_api.py details EH38E2941922 \
    --output /tmp/details.json
```

biosamples: Get biosample metadata for an assembly.

```
uv run scripts/screen_api.py biosamples \
    --output /tmp/biosamples.json
```

orthologs: Get orthologous cCREs in another assembly.

```
uv run scripts/screen_api.py orthologs EH38E2941922 \
    --output /tmp/orthologs.json
```

linked-genes: Find linked genes via methods like HiC or eQTLs.

```
uv run scripts/screen_api.py linked-genes \
    EH38E1516972 --output /tmp/linked.json
```

gene-expression: Get gene expression (TPM) across all biosamples for a named gene. Internally resolves the gene symbol to an Ensembl gene ID, then queries per-biosample RNA-seq quantifications.
```
uv run scripts/screen_api.py gene-expression GAPDH \
    --output /tmp/gene_expr.json
```

entex: Get ENTEx data for a cCRE or genomic region.

```
uv run scripts/screen_api.py entex \
    --accession EH38E1310345 \
    --output /tmp/entex.json

uv run scripts/screen_api.py entex \
    --region chr1:1000068:1000409 \
    --output /tmp/entex.json
```

gwas: Query genome-wide association studies, SNPs, or enrichment data.

```
uv run scripts/screen_api.py gwas studies \
    --output /tmp/gwas.json

uv run scripts/screen_api.py gwas snps --study \
    Ahola-Olli_AV-27989323-Eotaxin_levels \
    --output /tmp/gwas_snps.json
```

You can supply the --assembly mm10 or --assembly grch38 flag to explicitly request a specific assembly for most commands. By default, the script targets grch38 but will automatically fall back to mm10 if no results are found or if the query fails.
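The fallback behaviour described above follows a common pattern, sketched here in Python. This is not the script's actual code: `query_with_fallback` and the `run_query` callable are hypothetical stand-ins for whatever performs the real request.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def query_with_fallback(
    run_query: Callable[[str], List],
    assemblies: Sequence[str] = ("grch38", "mm10"),
) -> Tuple[Optional[str], List]:
    """Try each assembly in order; return the first one that yields results."""
    for assembly in assemblies:
        try:
            results = run_query(assembly)
        except Exception:
            # A failed query is treated like an empty result: try the next assembly.
            continue
        if results:
            return assembly, results
    return None, []
```

Returning the assembly alongside the results makes it explicit which genome the hits belong to, which matters because a silent fallback from human to mouse coordinates is easy to overlook. Passing `--assembly` explicitly avoids the ambiguity.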

ENCODE Portal REST API (Direct Access)

For accessing raw experiments, ChIP-seq peaks, or other datasets that are not represented as cCREs in SCREEN, use the scripts/encode_portal_api.py script. It allows custom queries to the ENCODE Portal REST API.

Usage

```
uv run scripts/encode_portal_api.py search "type=Experiment&target.label=ZNF549" --output /tmp/znf549_experiments.json
```
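The search argument is a URL query string, so it can be assembled from key/value pairs instead of being typed by hand. The sketch below uses the standard library's `urlencode`; the helper name `build_portal_query` is hypothetical, and the filter keys are the ones from the usage example above.

```python
from urllib.parse import urlencode


def build_portal_query(filters: dict) -> str:
    """Encode filter key/value pairs into a query string for the search command."""
    return urlencode(filters)


query = build_portal_query({"type": "Experiment", "target.label": "ZNF549"})
print(query)  # type=Experiment&target.label=ZNF549
```

Building the string this way also percent-encodes values containing spaces or other reserved characters, which are easy to get wrong when quoting by hand in a shell.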

Data Analysis Tips

When analyzing .bed or .bigBed files downloaded from ENCODE, standard bioinformatics tools are highly recommended for finding overlaps (e.g., between gene promoters and peaks):

bedtools: For fast mathematical operations on genomic intervals.
bigBedToBed: For converting binary BigBed files to readable BED format.
pybedtools: A Python wrapper for bedtools.

Write custom logic if these tools are not pre-installed.
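As a starting point for such custom logic, here is a minimal pure-Python overlap check for BED intervals. It is a sketch under stated assumptions: the function names are hypothetical, only the first three BED columns are used, and the pairwise comparison is quadratic per chromosome, so it suits small files rather than genome-wide peak sets.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open as in BED


def parse_bed(lines: Iterable[str]) -> List[Interval]:
    """Parse the first three columns of BED lines, skipping headers and blanks."""
    intervals = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def overlapping(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the intervals in `a` that overlap at least one interval in `b`."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in b:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = []
    for chrom, start, end in a:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, [])):
            hits.append((chrom, start, end))
    return hits
```

For example, `overlapping(parse_bed(open("promoters.bed")), parse_bed(open("peaks.bed")))` would list the promoters that overlap a peak; both file names are placeholders.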

Custom Queries (SCREEN GraphQL)

If you need to make a complex GraphQL query that the script does not support, read references/graphql_schema.md for a reference of available queries, arguments, and return fields in the SCREEN GraphQL API.


Repository: google-deepmind/science-skills
