kegg-database
Direct REST API access to KEGG (academic use only). Pathway analysis, gene-pathway mapping, metabolic pathways, drug interactions, ID conversion. Use this for direct HTTP/REST work or KEGG-specific control.
| name | kegg_database |
| description | Direct REST API access to KEGG (academic use only). Pathway analysis, gene-pathway mapping, metabolic pathways, drug interactions, ID conversion. Use this for direct HTTP/REST work or KEGG-specific control. |
| license | Non-academic use of KEGG requires a commercial license |
| metadata | {"skill-author":"VenusFactory2."} |
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a comprehensive bioinformatics resource for biological pathway analysis and molecular interaction networks. In this project the agent exposes only download tools: save database info, entry lists, search results, entry data, ID conversions, cross-references, and drug-drug interactions to files; each returns rich JSON {status, file_info, content_preview, biological_metadata, execution_context}. For programmatic use, the package also provides query-style APIs (see Project Modules).
Important: KEGG API is made available only for academic use by academic users.
This skill should be used when querying pathways, genes, compounds, enzymes, diseases, and drugs across multiple organisms using KEGG's REST API.
The skill provides:
- src/tools/database/kegg/: kegg_rest.py (base HTTP client), kegg_operations.py (query/download operations), kegg_api.py (backward-compat re-exports); all download functions are re-exported via the package. For programmatic use, import e.g. from src.tools.database.kegg import download_kegg_entry_by_id, ...
- references/kegg_reference.md

| Tool name | Arguments | Purpose |
|---|---|---|
| download_kegg_info_by_database | database, out_path | Download KEGG database info/statistics to file |
| download_kegg_list_by_database | database, out_path, org_or_ids (optional) | Download KEGG entry list by database to file |
| download_kegg_find_by_database | database, query, out_path, option (optional) | Download KEGG search results to file |
| download_kegg_entry_by_id | entry_id, out_path, format (optional) | Download KEGG entry data by entry ID to file |
| download_kegg_conv_by_id | target_db, source_id, out_path | Download KEGG ID conversion result to file |
| download_kegg_link_by_id | target_db, source_id, out_path | Download KEGG cross-reference links to file |
| download_kegg_ddi_by_id | drug_id, out_path | Download KEGG drug-drug interaction data to file |
All return rich JSON: {status, file_info, content_preview, biological_metadata, execution_context}. Academic use only.
| Capability | Function | Module | Purpose |
|---|---|---|---|
| HTTP client | kegg_request(operation, *path_parts) | kegg_rest.py | Base REST GET request, returns text |
| ID helper | _join_ids(entry_id) | kegg_rest.py | Format one or multiple IDs for URL (max 10) |
| Query: info | query_kegg_info_by_database(database) | kegg_operations.py | Returns rich JSON in memory |
| Query: list | query_kegg_list_by_database(database, org_or_ids) | kegg_operations.py | Returns rich JSON in memory |
| Query: find | query_kegg_find_by_database(database, query, option) | kegg_operations.py | Returns rich JSON in memory |
| Query: entry | query_kegg_entry_by_id(entry_id, format) | kegg_operations.py | Returns rich JSON in memory |
| Query: conv | query_kegg_conv_by_id(target_db, source_id) | kegg_operations.py | Returns rich JSON in memory |
| Query: link | query_kegg_link_by_id(target_db, source_id) | kegg_operations.py | Returns rich JSON in memory |
| Query: ddi | query_kegg_ddi_by_id(drug_id) | kegg_operations.py | Returns rich JSON in memory |
| Download: info | download_kegg_info_by_database(database, out_path) | kegg_operations.py | Save to file, return rich JSON |
| Download: list | download_kegg_list_by_database(database, out_path, org_or_ids) | kegg_operations.py | Save to file, return rich JSON |
| Download: find | download_kegg_find_by_database(database, query, out_path, option) | kegg_operations.py | Save to file, return rich JSON |
| Download: entry | download_kegg_entry_by_id(entry_id, out_path, format) | kegg_operations.py | Save to file, return rich JSON |
| Download: conv | download_kegg_conv_by_id(target_db, source_id, out_path) | kegg_operations.py | Save to file, return rich JSON |
| Download: link | download_kegg_link_by_id(target_db, source_id, out_path) | kegg_operations.py | Save to file, return rich JSON |
| Download: ddi | download_kegg_ddi_by_id(drug_id, out_path) | kegg_operations.py | Save to file, return rich JSON |
| Compat alias | kegg_info, kegg_list, kegg_find, kegg_get, kegg_conv, kegg_link, kegg_ddi | kegg_operations.py | Backward-compat aliases for query functions |
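The table notes that _join_ids formats one or more IDs for a URL with a limit of 10. The following is a minimal sketch of that batching idea, not the project's implementation: `join_ids` and `chunk_ids` are hypothetical names, and the `+` separator and the limit of 10 are taken from the descriptions in this document.

```python
from typing import Iterable, Iterator, List

MAX_IDS_PER_REQUEST = 10  # limit stated in the table above


def join_ids(entry_ids: Iterable[str]) -> str:
    """Join IDs with '+', refusing more than the per-request limit."""
    ids = [i.strip() for i in entry_ids if i.strip()]
    if not ids:
        raise ValueError("at least one entry ID is required")
    if len(ids) > MAX_IDS_PER_REQUEST:
        raise ValueError(
            f"at most {MAX_IDS_PER_REQUEST} IDs per request, got {len(ids)}"
        )
    return "+".join(ids)


def chunk_ids(entry_ids: List[str], size: int = MAX_IDS_PER_REQUEST) -> Iterator[str]:
    """Yield '+'-joined batches so that each batch fits in one request."""
    for start in range(0, len(entry_ids), size):
        yield join_ids(entry_ids[start:start + size])


if __name__ == "__main__":
    genes = [f"hsa:{n}" for n in range(1, 24)]  # 23 illustrative IDs
    for batch in chunk_ids(genes):
        print(batch)
```

Splitting a long ID list this way keeps each call within the limit while leaving the caller in charge of how the batches are requested and merged.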
Download database info:
from src.tools.database.kegg import download_kegg_info_by_database
result = download_kegg_info_by_database("pathway", "output/kegg_info_pathway.txt")
Query in-memory:
from src.tools.database.kegg.kegg_operations import query_kegg_info_by_database
result = query_kegg_info_by_database("pathway")
Common databases: kegg, pathway, module, brite, genes, genome, compound, glycan, reaction, enzyme, disease, drug
Download entry list:
from src.tools.database.kegg import download_kegg_list_by_database
# List all reference pathways
result = download_kegg_list_by_database("pathway", "output/kegg_pathways.txt")
# List human-specific pathways
result = download_kegg_list_by_database("pathway", "output/kegg_hsa_pathways.txt", org_or_ids="hsa")
Common organism codes: hsa (human), mmu (mouse), dme (fruit fly), sce (yeast), eco (E. coli)
Download search results:
from src.tools.database.kegg import download_kegg_find_by_database
# Keyword search
result = download_kegg_find_by_database("genes", "p53", "output/kegg_find_p53.txt")
# Chemical formula search (partial match)
result = download_kegg_find_by_database("compound", "C7H10N4O2", "output/kegg_find_formula.txt", option="formula")
# Exact mass range search
result = download_kegg_find_by_database("drug", "300-310", "output/kegg_find_mass.txt", option="exact_mass")
Search options: formula (partial match, independent of atom order), exact_mass (range), mol_weight (range)
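To make the search options concrete, here is a small sketch that validates an option and assembles the path of a KEGG `find` request. `build_find_path` is a hypothetical helper, not part of the package; restricting the three options to the compound and drug databases reflects my reading of the KEGG REST documentation and should be treated as an assumption.

```python
from typing import Optional
from urllib.parse import quote

STRUCTURE_OPTIONS = {"formula", "exact_mass", "mol_weight"}
# Assumption: these options only apply to the compound and drug databases.
STRUCTURE_DATABASES = {"compound", "drug"}


def build_find_path(database: str, query: str, option: Optional[str] = None) -> str:
    """Return the path portion of a 'find' request, e.g. 'find/genes/p53'."""
    if option is not None:
        if option not in STRUCTURE_OPTIONS:
            raise ValueError(f"unknown option {option!r}")
        if database not in STRUCTURE_DATABASES:
            raise ValueError(
                f"option {option!r} needs database 'compound' or 'drug'"
            )
    # Percent-encode the query; '+' is kept because it separates keywords.
    parts = ["find", database, quote(query, safe="+-.")]
    if option is not None:
        parts.append(option)
    return "/".join(parts)


if __name__ == "__main__":
    print(build_find_path("genes", "p53"))
    print(build_find_path("drug", "300-310", "exact_mass"))
```

Checking the option before sending the request turns a likely 400 response into an immediate, descriptive error.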
Download entry data:
from src.tools.database.kegg import download_kegg_entry_by_id
# Get pathway entry
result = download_kegg_entry_by_id("hsa00010", "output/kegg_glycolysis.txt")
# Get protein sequence (FASTA)
result = download_kegg_entry_by_id("hsa:10458", "output/kegg_gene_aaseq.fasta", format="aaseq")
# Get compound structure
result = download_kegg_entry_by_id("cpd:C00002", "output/kegg_atp.mol", format="mol")
# Get pathway as JSON (single entry only)
result = download_kegg_entry_by_id("hsa05130", "output/kegg_pathway.json", format="json")
Output formats: aaseq (protein FASTA), ntseq (nucleotide FASTA), mol (MOL format), kcf (KCF format), image (PNG), kgml (XML), json (pathway JSON)
Important: Image, KGML, and JSON formats allow only one entry at a time.
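The single-entry restriction can be checked before a request is made. The sketch below is illustrative only: `check_entry_request` is a hypothetical helper, and it assumes that several IDs are passed either as a list or as one `+`-separated string.

```python
from typing import List, Optional, Union

# Formats that, per the note above, accept only one entry at a time.
SINGLE_ENTRY_FORMATS = frozenset({"image", "kgml", "json"})


def check_entry_request(
    entry_id: Union[str, List[str]], fmt: Optional[str] = None
) -> List[str]:
    """Return the requested IDs, or raise if the format forbids batching."""
    if isinstance(entry_id, str):
        ids = [part for part in entry_id.split("+") if part]
    else:
        ids = [part for part in entry_id if part]
    if not ids:
        raise ValueError("no entry ID given")
    if fmt in SINGLE_ENTRY_FORMATS and len(ids) > 1:
        raise ValueError(
            f"format {fmt!r} accepts one entry at a time, got {len(ids)}"
        )
    return ids


if __name__ == "__main__":
    print(check_entry_request("hsa05130", "json"))
    print(check_entry_request("hsa:10458+hsa:7157", "aaseq"))
```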
Download ID conversion:
from src.tools.database.kegg import download_kegg_conv_by_id
# Convert KEGG gene to NCBI Gene ID
result = download_kegg_conv_by_id("ncbi-geneid", "hsa:10458", "output/kegg_conv.txt")
# Convert to UniProt
result = download_kegg_conv_by_id("uniprot", "hsa:10458", "output/kegg_conv_uniprot.txt")
Supported conversions: ncbi-geneid, ncbi-proteinid, uniprot, pubchem, chebi
Download cross-references:
from src.tools.database.kegg import download_kegg_link_by_id
# Get genes in a specific pathway
result = download_kegg_link_by_id("genes", "hsa00010", "output/kegg_link_glycolysis.txt")
# Find pathways containing a specific gene
result = download_kegg_link_by_id("pathway", "hsa:10458", "output/kegg_link_gene.txt")
# Find compounds in a pathway
result = download_kegg_link_by_id("compound", "hsa00010", "output/kegg_link_compound.txt")
Common links: genes ↔ pathway, pathway ↔ compound, pathway ↔ enzyme, genes ↔ ko (orthology)
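KEGG link and conv responses are plain text with two tab-separated columns, one pair per line. A minimal sketch for turning a saved file's text into a mapping follows; `parse_pairs` is a hypothetical helper and the sample text is illustrative rather than a recorded response.

```python
from collections import defaultdict
from typing import Dict, List


def parse_pairs(text: str) -> Dict[str, List[str]]:
    """Group the second column by the first column of tab-separated lines."""
    mapping: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, _, target = line.partition("\t")
        if not target:
            continue  # skip lines that lack a second column
        mapping[source.strip()].append(target.strip())
    return dict(mapping)


if __name__ == "__main__":
    sample = "path:hsa00010\thsa:3101\npath:hsa00010\thsa:3098\n"
    print(parse_pairs(sample))
```

The same function works in either direction (pathway to genes or gene to pathways) because it only relies on the column order of the response.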
Download DDI data:
from src.tools.database.kegg import download_kegg_ddi_by_id
result = download_kegg_ddi_by_id("D00001", "output/kegg_ddi.txt")
Gene-to-pathway workflow:
from src.tools.database.kegg import (
download_kegg_find_by_database,
download_kegg_link_by_id,
download_kegg_entry_by_id,
)
# Step 1: Find gene by name
download_kegg_find_by_database("genes", "p53", "output/kegg_p53_genes.txt")
# Step 2: Link gene to pathways
download_kegg_link_by_id("pathway", "hsa:7157", "output/kegg_p53_pathways.txt")
# Step 3: Get pathway details
download_kegg_entry_by_id("hsa05200", "output/kegg_cancer_pathway.txt")
Compound-to-pathway workflow:
from src.tools.database.kegg import (
download_kegg_find_by_database,
download_kegg_link_by_id,
download_kegg_entry_by_id,
)
# Step 1: Search for compound
download_kegg_find_by_database("compound", "glucose", "output/kegg_glucose.txt")
# Step 2: Link compound to reactions/pathways
download_kegg_link_by_id("reaction", "cpd:C00031", "output/kegg_glucose_reactions.txt")
download_kegg_link_by_id("pathway", "rn:R00299", "output/kegg_reaction_pathways.txt")
# Step 3: Get pathway details
download_kegg_entry_by_id("map00010", "output/kegg_glycolysis.txt")
ID conversion and sequence retrieval workflow:
from src.tools.database.kegg import (
download_kegg_conv_by_id,
download_kegg_entry_by_id,
)
# Step 1: Convert KEGG gene IDs to external database IDs
download_kegg_conv_by_id("uniprot", "hsa:10458", "output/kegg_to_uniprot.txt")
download_kegg_conv_by_id("ncbi-geneid", "hsa:10458", "output/kegg_to_ncbi.txt")
# Step 2: Get sequences using KEGG
download_kegg_entry_by_id("hsa:10458", "output/kegg_gene_seq.fasta", format="aaseq")
Cross-organism pathway comparison (in-memory queries):
from src.tools.database.kegg.kegg_operations import (
query_kegg_list_by_database,
query_kegg_entry_by_id,
)
# List pathways for multiple organisms
human_pathways = query_kegg_list_by_database("pathway", "hsa")
mouse_pathways = query_kegg_list_by_database("pathway", "mmu")
# Get organism-specific pathway details
hsa_glycolysis = query_kegg_entry_by_id("hsa00010")
mmu_glycolysis = query_kegg_entry_by_id("mmu00010")
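Once pathway lists for two organisms are available, they can be compared by their shared five-digit pathway number. The sketch below works on raw list text (one `id<TAB>name` line per pathway); how that text is pulled out of the query result is not specified here, so `pathway_numbers` and `compare_pathways` are hypothetical helpers and the sample lines are illustrative.

```python
import re
from typing import Dict, List


def pathway_numbers(list_text: str) -> Dict[str, str]:
    """Map each five-digit pathway number to its name."""
    result: Dict[str, str] = {}
    for line in list_text.splitlines():
        entry_id, _, name = line.partition("\t")
        # Handles both 'hsa00010' and a prefixed form such as 'path:hsa00010'.
        match = re.search(r"(\d{5})$", entry_id.strip())
        if match:
            result[match.group(1)] = name.strip()
    return result


def compare_pathways(a_text: str, b_text: str) -> Dict[str, List[str]]:
    """Return pathway numbers shared by both lists or unique to one of them."""
    a, b = pathway_numbers(a_text), pathway_numbers(b_text)
    return {
        "shared": sorted(a.keys() & b.keys()),
        "only_a": sorted(a.keys() - b.keys()),
        "only_b": sorted(b.keys() - a.keys()),
    }


if __name__ == "__main__":
    human = "hsa00010\tGlycolysis / Gluconeogenesis\nhsa05200\tPathways in cancer\n"
    mouse = "mmu00010\tGlycolysis / Gluconeogenesis\nmmu04930\tType II diabetes mellitus\n"
    print(compare_pathways(human, mouse))
```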
Download result (success):
{
"status": "success",
"file_info": {
"file_path": "/absolute/path/to/file.txt",
"file_name": "file.txt",
"file_size": 12345,
"format": "txt"
},
"content_preview": "first 500 chars...",
"biological_metadata": {"database": "pathway"},
"execution_context": {"download_time_ms": 234, "source": "KEGG"}
}
Query result (success):
{
"status": "success",
"content": "{...full JSON...}",
"content_preview": "first 500 chars...",
"biological_metadata": {"database": "pathway"},
"execution_context": {"query_time_ms": 123, "source": "KEGG"}
}
Error result:
{
"status": "error",
"error": {"type": "QueryError", "message": "...", "suggestion": "..."},
"file_info": null
}
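Callers usually want to branch on the status field before touching the rest of the payload. The sketch below shows one way to do that for the success and error shapes above. `unwrap_result` and `KeggToolError` are hypothetical names, and because this document does not say whether the tools return a dict or a JSON string, the helper accepts both.

```python
import json
from typing import Any, Dict, Union


class KeggToolError(RuntimeError):
    """Raised when a result payload reports a non-success status."""


def unwrap_result(result: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the payload on success; raise with message and suggestion otherwise."""
    payload = json.loads(result) if isinstance(result, str) else result
    if payload.get("status") != "success":
        error = payload.get("error") or {}
        detail = f"{error.get('type', 'Error')}: {error.get('message', 'unknown error')}"
        suggestion = error.get("suggestion")
        if suggestion:
            detail += f" (suggestion: {suggestion})"
        raise KeggToolError(detail)
    return payload


if __name__ == "__main__":
    ok = {"status": "success", "file_info": {"file_name": "file.txt"}}
    print(unwrap_result(ok)["file_info"]["file_name"])
```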
KEGG organizes pathways into seven major categories:
1. Metabolism (map00010 - Glycolysis, map00190 - Oxidative phosphorylation)
2. Genetic Information Processing (map03010 - Ribosome, map03040 - Spliceosome)
3. Environmental Information Processing (map04010 - MAPK signaling, map02010 - ABC transporters)
4. Cellular Processes (map04140 - Autophagy, map04210 - Apoptosis)
5. Organismal Systems (map04610 - Complement cascade, map04910 - Insulin signaling)
6. Human Diseases (map05200 - Pathways in cancer, map05010 - Alzheimer disease)
7. Drug Development

See references/kegg_reference.md for detailed pathway lists and classifications.
| Type | Format | Example |
|---|---|---|
| Pathway (reference) | map##### | map00010 |
| Pathway (human) | hsa##### | hsa00010 |
| Gene | organism:gene_number | hsa:10458 |
| Compound | cpd:C##### | cpd:C00002 (ATP) |
| Drug | dr:D##### | dr:D00001 |
| Enzyme | ec:EC_number | ec:1.1.1.1 |
| KO (Orthology) | ko:K##### | ko:K00001 |
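The formats in the table can be expressed as regular expressions, which is handy for catching malformed IDs before a request is sent. This is a sketch that covers only the rows listed above; `classify_kegg_id` is a hypothetical helper, and the assumption that organism codes are three or four lowercase letters is mine.

```python
import re
from typing import Optional

# Order matters: specific prefixes are tested before the generic patterns.
_PATTERNS = [
    ("pathway (reference)", re.compile(r"^map\d{5}$")),
    ("compound", re.compile(r"^cpd:C\d{5}$")),
    ("drug", re.compile(r"^dr:D\d{5}$")),
    ("enzyme", re.compile(r"^ec:\d+(\.(\d+|-)){3}$")),
    ("ko", re.compile(r"^ko:K\d{5}$")),
    # Assumption: organism codes are 3 or 4 lowercase letters.
    ("pathway (organism)", re.compile(r"^[a-z]{3,4}\d{5}$")),
    ("gene", re.compile(r"^[a-z]{3,4}:\S+$")),
]


def classify_kegg_id(identifier: str) -> Optional[str]:
    """Return the ID type from the table above, or None if nothing matches."""
    for label, pattern in _PATTERNS:
        if pattern.match(identifier):
            return label
    return None


if __name__ == "__main__":
    for example in ["map00010", "hsa00010", "hsa:10458", "cpd:C00002", "ec:1.1.1.1"]:
        print(example, "->", classify_kegg_id(example))
```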
Scripts live in src/tools/database/kegg/. Import from package: from src.tools.database.kegg import ...
Central operations module (kegg_operations.py) providing both query and download functions:
- query_kegg_info_by_database(database): returns rich JSON in memory
- query_kegg_list_by_database(database, org_or_ids): returns rich JSON in memory
- query_kegg_find_by_database(database, query, option): returns rich JSON in memory
- query_kegg_entry_by_id(entry_id, format): returns rich JSON in memory
- query_kegg_conv_by_id(target_db, source_id): returns rich JSON in memory
- query_kegg_link_by_id(target_db, source_id): returns rich JSON in memory
- query_kegg_ddi_by_id(drug_id): returns rich JSON in memory
- download_kegg_info_by_database(database, out_path): save to file, return rich JSON
- download_kegg_list_by_database(database, out_path, org_or_ids): save to file, return rich JSON
- download_kegg_find_by_database(database, query, out_path, option): save to file, return rich JSON
- download_kegg_entry_by_id(entry_id, out_path, format): save to file, return rich JSON
- download_kegg_conv_by_id(target_db, source_id, out_path): save to file, return rich JSON
- download_kegg_link_by_id(target_db, source_id, out_path): save to file, return rich JSON
- download_kegg_ddi_by_id(drug_id, out_path): save to file, return rich JSON

Backward-compat aliases: kegg_info, kegg_list, kegg_find, kegg_get, kegg_conv, kegg_link, kegg_ddi.
Test: bash script/tools/database/test_kegg.sh — runs kegg_operations.py --test, outputs under example/database/kegg/.
Base HTTP client (kegg_rest.py):
- kegg_request(operation, *path_parts): GET request to https://rest.kegg.jp/, returns response text
- _join_ids(entry_id): format one or multiple entry IDs for URL (max 10, + separated)

Backward-compat entry point (kegg_api.py): re-exports all query/download functions and legacy aliases.
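For readers who want to see what a base client of this kind involves, here is a standard-library sketch of a GET helper built around the operation and path parts described above. It is not the code in kegg_rest.py: `build_url` and `kegg_get_text` are hypothetical names, and the 30-second timeout is an arbitrary choice.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://rest.kegg.jp"


def build_url(operation: str, *path_parts: str) -> str:
    """Assemble BASE_URL/operation/part1/part2/..., skipping empty parts."""
    cleaned = []
    for part in (operation, *path_parts):
        text = str(part).strip("/")
        if text:
            # Keep ':' and '+' because KEGG IDs and ID lists rely on them.
            cleaned.append(quote(text, safe=":+-."))
    return BASE_URL + "/" + "/".join(cleaned)


def kegg_get_text(operation: str, *path_parts: str, timeout: float = 30.0) -> str:
    """Send a GET request and return the body; urlopen raises HTTPError on 4xx/5xx."""
    with urlopen(build_url(operation, *path_parts), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds URLs here; call kegg_get_text(...) to contact the server.
    print(build_url("get", "hsa:10458", "aaseq"))
    print(build_url("list", "pathway", "hsa"))
```

Keeping URL construction separate from the network call makes the path logic easy to test without contacting the KEGG server.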
- 404 Not Found: Entry or database doesn't exist; verify IDs and organism codes
- 400 Bad Request: Syntax error in API call; check parameter formatting
- Empty results: Search term may not match entries; try broader keywords
- Image/KGML errors: These formats only work with single entries; remove batch processing
references/kegg_reference.md: Comprehensive API documentation including the complete database list, operation syntax, all organism codes, HTTP status codes, and integration with Biopython/R.