pdbe-api
// Query the PDBe (Protein Data Bank in Europe) REST API and Solr search API from within Coot to access structure metadata, validation data, revision history, search capabilities, and download coordinate files
| name | pdbe-api |
| description | Query the PDBe (Protein Data Bank in Europe) REST API and Solr search API from within Coot to access structure metadata, validation data, revision history, search capabilities, and download coordinate files |
The PDBe (Protein Data Bank in Europe) provides comprehensive REST and Solr-based APIs for programmatic access to structure data, validation reports, compound information, revision history, and search capabilities. Coot can access these APIs directly using the coot_get_url_as_string_py() function, which now supports both text and binary data.
coot.coot_get_url_as_string_py(url) - Fetch URL content
Returns:
- str for text/JSON content (valid UTF-8)
- bytes for binary content (gzipped files, images, etc.)

import json
# Example 1: Get structure summary (returns string)
result = coot.coot_get_url_as_string_py("https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/4wa9")
data = json.loads(result)
# Example 2: Download gzipped coordinates (returns bytes)
import gzip
compressed = coot.coot_get_url_as_string_py("https://files.rcsb.org/download/4wa9.cif.gz")
decompressed = gzip.decompress(compressed)
imol = coot.read_coordinates_as_string(decompressed.decode('utf-8'), "4wa9")
Base URL: https://www.ebi.ac.uk/pdbe/api/
Documentation: https://www.ebi.ac.uk/pdbe/api/doc/
Base URL: https://www.ebi.ac.uk/pdbe/graph-api/
Documentation: https://pdbe.org/graph-api
Base URL: https://www.ebi.ac.uk/pdbe/search/pdb/select?
Documentation: https://www.ebi.ac.uk/pdbe/api/doc/search.html
Get basic information about a structure including deposition date, revision date, authors, and experimental method:
import json
pdb_id = "4wa9"
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
# Handle both string and bytes responses
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
# Extract key information
entry = data[pdb_id][0]
print(f"Title: {entry['title']}")
print(f"Release date: {entry['release_date']}")
print(f"Revision date: {entry['revision_date']}")
print(f"Method: {entry['experimental_method']}")
print(f"Authors: {entry['entry_authors']}")
Key fields in response:
- title - Structure title
- release_date - Original release date (YYYYMMDD)
- revision_date - Most recent revision date (YYYYMMDD)
- experimental_method - List of experimental methods
- entry_authors - List of authors
- number_of_entities - Count of different entity types (protein, ligand, water, etc.)

Get organism and molecule details:
import json
pdb_id = "4wa9"
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/molecules/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
if pdb_id in data:
    for entity in data[pdb_id]:
        # molecule_name and source may be lists or scalars, so handle both
        molecule_name = entity.get('molecule_name', ['N/A'])[0] if isinstance(entity.get('molecule_name', []), list) else entity.get('molecule_name', 'N/A')
        source = entity.get('source', [{}])[0] if isinstance(entity.get('source', []), list) else entity.get('source', {})
        organism = source.get('organism_scientific_name', 'N/A')
        expression_host = source.get('expression_host_scientific_name', 'N/A')
        print(f"Molecule: {molecule_name}")
        print(f"  Source organism: {organism}")
        if expression_host != 'N/A' and expression_host != organism:
            print(f"  Expression host: {expression_host}")
Get detailed information about a specific compound including formula, SMILES, InChI, and revision history:
import json
comp_id = "AXI" # 3-letter code
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/compound/summary/{comp_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
compound = data[comp_id][0]
print(f"Name: {compound['name']}")
print(f"Formula: {compound['formula']}")
print(f"Weight: {compound['weight']}")
print(f"Creation date: {compound['creation_date']}")
print(f"Revision date: {compound['revision_date']}")
print(f"InChI: {compound['inchi']}")
print(f"SMILES: {compound['smiles'][0]['name']}")
Use case: Check if a ligand definition was recently revised, which might explain geometry changes.
Download and load PDB/mmCIF files directly:
import gzip
# Download current version from RCSB
pdb_id = "4wa9"
url = f"https://files.rcsb.org/download/{pdb_id}.cif.gz"
print(f"Downloading {pdb_id}...")
compressed = coot.coot_get_url_as_string_py(url)
print(f"Downloaded {len(compressed)} bytes (compressed)")
# Decompress
decompressed = gzip.decompress(compressed)
print(f"Decompressed to {len(decompressed)} bytes")
# Load into Coot
imol = coot.read_coordinates_as_string(decompressed.decode('utf-8'), f"{pdb_id}")
print(f"Loaded as molecule {imol}")
Note: The wwPDB versioned archive exists but is not currently accessible via HTTPS through this API. Use the current version from RCSB or PDBe.
Get residue-wise outliers including clashes, geometry outliers, and density fit issues:
import json
pdb_id = "4wa9"
url = f"https://www.ebi.ac.uk/pdbe/api/validation/residuewise_outlier_summary/entry/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
# Note: Response may have multiple JSON objects, parse carefully
# Access specific chain/residue validation data
Available validation endpoints:
- /validation/residuewise_outlier_summary/entry/{pdb_id} - Residue-level outliers
- /validation/rama_sidechain_listing/entry/{pdb_id} - Ramachandran and rotamer outliers
- /validation/global_percentiles/entry/{pdb_id} - Overall quality metrics

Check if a structure has been superseded or revised:
import json
pdb_id = "4wa9"
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/status/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
status = data[pdb_id][0]
print(f"Status: {status['status_code']}") # REL = released, OBS = obsolete
print(f"Since: {status['since']}")
print(f"Superseded by: {status['superceded_by']}")
print(f"Obsoletes: {status['obsoletes']}")
Get information about ligand binding sites and interactions:
import json
pdb_id = "4wa9"
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/ligand_monomers/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
# Access ligand information per chain
for entity in data[pdb_id]:
    print(f"Chain: {entity['chain_id']}")
    for ligand in entity.get('ligands', []):
        print(f"  Ligand: {ligand['chem_comp_id']}")
        print(f"  Residue: {ligand['author_residue_number']}")
Get biological assembly information:
import json
pdb_id = "2hyy"
url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/assembly/{pdb_id}"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
for assembly in data[pdb_id]:
    print(f"Assembly {assembly['assembly_id']}: {assembly['name']}")
    print(f"  Form: {assembly['form']}")
    print(f"  Preferred: {assembly['preferred']}")
The Solr search API allows complex queries across the entire PDB. However, it has important limitations.
✅ Metadata searches:
- release_year:2025
- experimental_method:"X-ray diffraction"
- resolution:[* TO 2.0] or resolution:[1.5 TO 2.5]
- organism_scientific_name:"Homo sapiens"

✅ Presence/absence queries:
- number_of_protein_chains:[1 TO *]
- has_carb_polymer:Y
- has_bound_molecule:Y
- has_modified_residues:Y

✅ Component searches:
- chem_comp_id:ATP
- ligand_name:imatinib
- molecule_name:*kinase*

✅ Author/citation:
- entry_authors:"Smith J"
- uniprot_accession:P12345

✅ Combined queries:
# Example: Human kinases with resolution < 2Å from 2024
query = 'release_year:2024 AND organism_scientific_name:"Homo sapiens" AND molecule_name:*kinase* AND resolution:[* TO 2.0]'
❌ Detailed connectivity: Cannot search for "THR covalently bonded to NAG" or other specific atom-level connections
❌ Geometry queries: Cannot search for "bonds longer than X" or "angles outside range Y"
❌ Spatial relationships: Cannot search for "atoms within 5Å of ligand"
❌ Sequence motifs: Cannot search for "structures with GXGXXG motif"
❌ Complex structural features: Cannot search for "beta-barrel with 8 strands"
❌ Validation specifics: Cannot search for "residues with Ramachandran outliers at position X"
The Pattern: Solr indexes metadata and simple categorical data, not structural details or relationships.
For analyses requiring connectivity or geometry (like finding O-glycosylated threonines), you must:
import json
import urllib.parse
# Simple search for high-resolution X-ray structures from 2024
query = 'release_year:2024 AND experimental_method:"X-ray diffraction" AND resolution:[* TO 1.5]'
# Percent-encode the query so spaces, quotes, and brackets survive in the URL
encoded = urllib.parse.quote(query)
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={encoded}&wt=json&rows=10&fl=pdb_id,title,resolution"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
print(f"Found {data['response']['numFound']} structures")
for doc in data['response']['docs']:
    print(f"  {doc['pdb_id']}: {doc.get('resolution', 'N/A')}Å")
    print(f"    {doc.get('title', 'N/A')[:70]}")
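A search often matches more documents than one `rows` value returns. The sketch below pages through results. It assumes the PDBe search endpoint honours the standard Solr `start` parameter, and `fetch_page` is a hypothetical callable (not part of Coot or PDBe) that you supply to fetch and parse one page.

```python
def iter_solr_docs(fetch_page, page_size=100, max_docs=1000):
    """Yield Solr documents page by page.

    fetch_page(start, rows) is a caller-supplied (hypothetical) function that
    returns the parsed JSON dict for one page, e.g. by appending
    "&start={start}&rows={rows}" to a search URL and calling json.loads().
    """
    start = 0
    yielded = 0
    while yielded < max_docs:
        response = fetch_page(start, page_size)["response"]
        docs = response["docs"]
        if not docs:
            break  # nothing more to read
        for doc in docs:
            yield doc
            yielded += 1
            if yielded >= max_docs:
                return  # stop at the caller's cap
        start += len(docs)
        if start >= response["numFound"]:
            break  # every matching document has been seen
```

The `max_docs` cap is a deliberate safety net: a broad query can match a large part of the archive, and the cap keeps a batch job from requesting all of it by accident.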
Identifiers & Metadata:
- pdb_id - PDB entry ID
- molecule_name - Molecule name
- molecule_type - Entity type: use "Protein" (capital P), NOT polypeptide(L) or protein
- molecule_sequence - One-letter sequence string (stored but not full-text indexed; a wildcard search like *C*C* returns 0 results, so fetch and filter in Python instead)
- polymer_length - Length of the polymer entity in residues (supports range queries: [5 TO 30])
- number_of_polymer_residues - Total residues across all chains in the entry
- number_of_protein_chains - Number of protein chains
- organism_scientific_name - Source organism
- experimental_method - Experimental method (e.g., "X-ray diffraction")
- resolution - Structure resolution
- ligand_name - Ligand/compound name
- citation_title - Publication title
- deposition_date - Deposition date
- revision_date - Revision date

Experimental Details:
- experimental_method - Method (e.g., "X-ray diffraction", "Electron Microscopy", "Solution NMR")
- resolution - Structure resolution (numeric, use ranges like [1.0 TO 2.0])
- em_resolution - EM-specific resolution
- data_quality - Overall quality metric

Molecular Content:
- molecule_name - Molecule name (supports wildcards: *kinase*)
- molecule_type - Type (Protein, DNA, RNA, etc.)
- organism_scientific_name - Source organism
- organism_synonyms - Alternative organism names
- genus - Organism genus
- expression_host_scientific_name - Expression system

Ligands & Modifications:
- chem_comp_id - Chemical component 3-letter code
- ligand_name - Ligand name
- has_bound_molecule - Y/N
- has_carb_polymer - Y/N (has carbohydrate)
- has_modified_residues - Y/N
- number_of_bound_molecules - Count

Authors & Citations:
- entry_authors - Entry authors
- citation_authors - Publication authors
- citation_title - Paper title
- citation_year - Publication year
- pubmed_id - PubMed ID

Protein Details:
- uniprot_accession - UniProt accession
- uniprot_id - UniProt ID
- gene_name - Gene name
- go_id - Gene Ontology ID

Structure Properties:
- number_of_protein_chains - Count
- number_of_polymer_entities - Count
- assembly_composition - Assembly type
- symmetry_group - Symmetry

Example 1: High-resolution X-ray structures from 2024
import json
url = "https://www.ebi.ac.uk/pdbe/search/pdb/select?q=release_year:2024%20AND%20experimental_method:\"X-ray%20diffraction\"%20AND%20resolution:[*%20TO%201.5]&wt=json&rows=5&fl=pdb_id,title,resolution"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
Sorting by `sort=molecular_weight+asc` gives a 400 error. Use `polymer_length` instead
as a proxy for size, or use `number_of_polymer_residues` for entry-level size.
**5. Use `fq` (filter query) for range constraints**
Range filtering on numeric fields works well as a filter query:
```python
# Filter to entities with 5-30 residues:
url = "...&q=molecule_type:Protein&fq=polymer_length:[5+TO+30]&sort=polymer_length+asc..."
```
Example 2: Human kinase structures
query = "organism_scientific_name:\"Homo sapiens\" AND molecule_name:*kinase*"
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={query}&wt=json&rows=10&fl=pdb_id,title,resolution"
Example 3: Cryo-EM structures better than 3Å from 2025
query = "release_year:2025 AND experimental_method:\"Electron Microscopy\" AND resolution:[* TO 3.0]"
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={query}&wt=json&rows=10&fl=pdb_id,title,em_resolution"
Example 4: Structures with carbohydrates
query = "has_carb_polymer:Y"
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={query}&wt=json&rows=10&fl=pdb_id,title"
Example 5: Structures of a specific protein from different species
# Find ABL1 structures from different mammals
query = "molecule_name:*ABL1* OR molecule_name:*ABL*kinase*"
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={query}&wt=json&rows=100&fl=pdb_id,organism_scientific_name,title"
6. Discover available fields by fetching a sample document with fl=*
When you don't know what fields are in the index:
url = "https://www.ebi.ac.uk/pdbe/search/pdb/select?q=*:*&wt=json&rows=1&fl=*"
data = json.loads(coot.coot_get_url_as_string_py(url))
for k in sorted(data['response']['docs'][0].keys()):
    print(k)
Fields prefixed q_ and t_ are query/text variants of the base fields — ignore them
when exploring the schema.
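As a small sketch of that filtering step, the hypothetical helper below (not part of Coot or PDBe) drops the `q_` and `t_` variants from a sample document and returns the remaining field names in sorted order.

```python
def base_field_names(doc):
    """Return the sorted field names of a Solr document, skipping the
    q_* and t_* query/text variants of the base fields."""
    return sorted(k for k in doc if not k.startswith(("q_", "t_")))


# Illustrative document only; real documents carry many more fields.
sample = {"pdb_id": "4wa9", "q_pdb_id": "4wa9", "t_title": "x", "title": "x"}
print(base_field_names(sample))
```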
7. There is no disulfide or bond_types field in the Solr index
To find structures with disulfide bonds, you must:
/pdb/entry/molecules/{pdb_id}) to confirm the sequence and structure.

Find structures with specific ligand:
query = "ligand_name:axitinib"
Find high-resolution kinase structures:
query = "molecule_name:kinase AND resolution:[0 TO 2.0]"
Find structures revised in 2024:
query = "revision_date:[20240101 TO 20241231]"
Check if a structure has been significantly revised since release:
import json
from datetime import datetime
def check_structure_revision(pdb_id):
    """Check if structure was revised and when"""
    url = f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/{pdb_id}"
    result = coot.coot_get_url_as_string_py(url)
    if isinstance(result, bytes):
        result = result.decode('utf-8')
    data = json.loads(result)
    entry = data[pdb_id][0]
    release = entry['release_date']
    revision = entry['revision_date']
    # Convert to datetime for comparison
    release_dt = datetime.strptime(release, "%Y%m%d")
    revision_dt = datetime.strptime(revision, "%Y%m%d")
    days_diff = (revision_dt - release_dt).days
    years_diff = days_diff / 365.25
    print(f"PDB {pdb_id}:")
    print(f"  Released: {release}")
    print(f"  Revised: {revision}")
    print(f"  Time between release and revision: {years_diff:.1f} years")
    if days_diff > 30:
        print(f"  WARNING: Structure revised {days_diff} days after release")
        return True
    return False
# Example usage
check_structure_revision("4wa9")
Determine if a ligand definition was updated, which might explain geometry changes:
import json
def check_ligand_revision(comp_id):
    """Check when a ligand was last revised"""
    url = f"https://www.ebi.ac.uk/pdbe/api/pdb/compound/summary/{comp_id}"
    result = coot.coot_get_url_as_string_py(url)
    if isinstance(result, bytes):
        result = result.decode('utf-8')
    data = json.loads(result)
    compound = data[comp_id][0]
    print(f"Compound {comp_id} ({compound['name']}):")
    print(f"  Created: {compound['creation_date']}")
    print(f"  Revised: {compound['revision_date']}")
    if compound['creation_date'] != compound['revision_date']:
        print("  WARNING: Ligand definition was revised")
        return True
    return False
# Example usage
check_ligand_revision("AXI")
Download structures from different species and compare them:
import gzip
import json
def download_and_load_structure(pdb_id):
    """Download and load a structure from RCSB"""
    url = f"https://files.rcsb.org/download/{pdb_id}.cif.gz"
    print(f"Downloading {pdb_id}...")
    compressed = coot.coot_get_url_as_string_py(url)
    # Gzipped data arrives as bytes; a str (e.g. an HTML error page) means failure
    if not isinstance(compressed, bytes):
        print(f"ERROR: Could not download {pdb_id}")
        return None
    decompressed = gzip.decompress(compressed)
    imol = coot.read_coordinates_as_string(decompressed.decode('utf-8'), pdb_id)
    print(f"Loaded as molecule {imol}")
    return imol

def compare_species_structures(pdb_id1, pdb_id2):
    """Download two structures and superpose them"""
    # Download both structures
    imol1 = download_and_load_structure(pdb_id1)
    imol2 = download_and_load_structure(pdb_id2)
    if imol1 is None or imol2 is None:
        print("Failed to download one or both structures")
        return None
    # Superpose (using CA atoms from chain A, residues 240-400)
    print(f"\nSuperposing {pdb_id2} onto {pdb_id1}...")
    sel1 = "//A/240-400/CA"
    sel2 = "//A/240-400/CA"
    result = coot.superpose_with_atom_selection(imol1, imol2, sel1, sel2, 0)
    if result >= 0:
        print("Success! Structures superposed.")
    else:
        print("Superposition failed!")
    return imol1, imol2
# Example: Compare human and mouse ABL1
# First find structures using Solr
query = "molecule_name:*ABL1*"
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={query}&wt=json&rows=100&fl=pdb_id,organism_scientific_name"
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    result = result.decode('utf-8')
data = json.loads(result)
# Find human and mouse structures
human_pdbs = []
mouse_pdbs = []
for doc in data['response']['docs']:
    org = doc.get('organism_scientific_name', ['Unknown'])[0]
    if 'Homo sapiens' in org:
        human_pdbs.append(doc['pdb_id'])
    elif 'Mus musculus' in org:
        mouse_pdbs.append(doc['pdb_id'])
print(f"Human ABL1 structures: {len(human_pdbs)}")
print(f"Mouse ABL1 structures: {len(mouse_pdbs)}")
# Compare first human and mouse structures
if human_pdbs and mouse_pdbs:
    compare_species_structures(human_pdbs[0], mouse_pdbs[0])
Search for structures with the same ligand and protein:
import json
import urllib.parse

def find_related_structures(protein_name, ligand_name=None):
    """Find structures containing specific protein-ligand combination"""
    if ligand_name:
        query = f'molecule_name:*{protein_name}* AND chem_comp_id:{ligand_name}'
    else:
        query = f'molecule_name:*{protein_name}*'
    # Percent-encode the query so the spaces around AND survive in the URL
    encoded = urllib.parse.quote(query)
    url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={encoded}&wt=json&rows=50&fl=pdb_id,title,resolution,organism_scientific_name"
    result = coot.coot_get_url_as_string_py(url)
    if isinstance(result, bytes):
        result = result.decode('utf-8')
    data = json.loads(result)
    print(f"Found {data['response']['numFound']} structures")
    for doc in data['response']['docs']:
        org = doc.get('organism_scientific_name', ['N/A'])
        if isinstance(org, list):
            org = org[0] if org else 'N/A'
        print(f"  {doc['pdb_id']}: {doc.get('title', 'N/A')[:60]}")
        print(f"    Resolution: {doc.get('resolution', 'N/A')} Å")
        print(f"    Organism: {org}")
# Example usage
find_related_structures("ABL1", "STI") # ABL1 with imatinib
Always wrap API calls in try/except blocks and handle both string and bytes responses:
import json
def safe_pdbe_query(url):
    """Safely query PDBe API with error handling"""
    try:
        result = coot.coot_get_url_as_string_py(url)
        if not result:
            print(f"Empty response from {url}")
            return None
        # Handle bytes response
        if isinstance(result, bytes):
            result = result.decode('utf-8')
        # Check for HTML error pages
        if result.startswith("<!DOCTYPE") or result.startswith("<html"):
            print("Received HTML error page instead of JSON")
            print(result[:200])
            return None
        data = json.loads(result)
        return data
    except json.JSONDecodeError as e:
        print(f"JSON parsing error: {e}")
        print(f"Response was: {result[:200]}...")
        return None
    except Exception as e:
        print(f"Error querying PDBe API: {e}")
        return None

# Example usage
data = safe_pdbe_query("https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/4wa9")
if data:
    print("Success!")
The function returns bytes for binary data (gzipped files) and str for text (JSON). Always check the type:
import gzip
result = coot.coot_get_url_as_string_py(url)
if isinstance(result, bytes):
    # Binary data - might be gzipped
    if result.startswith(b'\x1f\x8b'):  # gzip magic bytes
        decompressed = gzip.decompress(result)
        content = decompressed.decode('utf-8')
    else:
        content = result.decode('utf-8')
else:
    # Already a string
    content = result
Some validation endpoints return multiple JSON objects or malformed responses. Handle carefully:
# Instead of json.loads(), parse line by line or handle errors
try:
    data = json.loads(result)
except json.JSONDecodeError:
    # Try alternative parsing or just display raw result
    print("Could not parse JSON, raw response:")
    print(result[:1000])
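One possible form of "alternative parsing" for a response that holds several JSON objects back to back is to decode them one at a time with the standard library's `json.JSONDecoder.raw_decode`. The helper name below is hypothetical, and the sketch assumes the objects are separated only by optional whitespace. It still raises `json.JSONDecodeError` on truly malformed input.

```python
import json


def parse_concatenated_json(text):
    """Decode every JSON value found back to back in text.

    raw_decode() returns the decoded value and the index where it stopped,
    so we can resume from there until the string is exhausted.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it here
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# Illustrative input only, not a real PDBe response
print(parse_concatenated_json('{"a": 1}\n{"b": [2, 3]}'))
```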
If you get UnicodeDecodeError, the response might contain non-UTF-8 bytes. This should be handled automatically by the function now, but if you encounter issues:
try:
    result = coot.coot_get_url_as_string_py(url)
except Exception as e:
    print(f"Error fetching URL: {e}")
Always encode special characters in Solr queries:
import urllib.parse
query = "molecule_name:\"Protein kinase\" AND resolution:[0 TO 2.0]"
encoded = urllib.parse.quote(query)
url = f"https://www.ebi.ac.uk/pdbe/search/pdb/select?q={encoded}&wt=json"
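When a request carries several parameters (`q`, `fq`, `fl`, `rows`), it can be easier to let `urllib.parse.urlencode` encode all of them at once. The helper below is a hypothetical convenience wrapper, not part of Coot or PDBe. It assumes the search endpoint accepts form-style encoding, where spaces become `+`, as in the `fq=polymer_length:[5+TO+30]` style.

```python
import urllib.parse

SOLR_SELECT = "https://www.ebi.ac.uk/pdbe/search/pdb/select"


def build_solr_url(query, fields=("pdb_id",), rows=10, filter_query=None):
    """Build a fully encoded Solr search URL from its parts."""
    params = [
        ("q", query),
        ("wt", "json"),
        ("rows", str(rows)),
        ("fl", ",".join(fields)),
    ]
    if filter_query:
        params.append(("fq", filter_query))
    # urlencode() escapes quotes, brackets, colons and spaces in every value
    return SOLR_SELECT + "?" + urllib.parse.urlencode(params)


url = build_solr_url(
    'molecule_name:"Protein kinase" AND resolution:[0 TO 2.0]',
    fields=("pdb_id", "title", "resolution"),
)
print(url)
```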
The PDBe API may rate limit excessive requests. Add delays between batch queries:
import time
pdb_ids = ["4wa9", "2hyy", "1iep"]
for pdb_id in pdb_ids:
    data = safe_pdbe_query(f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/{pdb_id}")
    # Process data...
    time.sleep(0.5)  # Wait 500ms between requests
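A fixed delay spaces requests out but does not recover from a request that fails. The sketch below is one way to retry with an exponentially growing delay. `fetch_with_retry` is a hypothetical helper, and `fetch` stands for any callable that takes a URL, such as `coot.coot_get_url_as_string_py`. The attempt count and delays are illustrative values, not limits published by PDBe.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on exceptions or empty responses.

    The wait doubles after each failed attempt: base_delay, 2*base_delay, ...
    The sleep argument is injectable so the function can be tested quickly.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            result = fetch(url)
            if result:
                return result
            last_error = ValueError("empty response")
        except Exception as exc:  # network-level failure raised by fetch
            last_error = exc
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error
```

Inside Coot this might be called as `fetch_with_retry(coot.coot_get_url_as_string_py, url)`, with the result then handled as str or bytes in the usual way.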
| Purpose | Endpoint |
|---|---|
| Structure summary | /pdb/entry/summary/{pdb_id} |
| Molecule/organism info | /pdb/entry/molecules/{pdb_id} |
| Compound info | /pdb/compound/summary/{comp_id} |
| Validation outliers | /validation/residuewise_outlier_summary/entry/{pdb_id} |
| Structure status | /pdb/entry/status/{pdb_id} |
| Ligand binding sites | /pdb/entry/ligand_monomers/{pdb_id} |
| Search structures | /search/pdb/select?q={query} |
| Download coordinates | https://files.rcsb.org/download/{pdb_id}.cif.gz |
| Query | Purpose |
|---|---|
| pdb_id:4wa9 | Specific PDB entry |
| molecule_name:*kinase* | By protein name (wildcards) |
| chem_comp_id:ATP | Structures with specific ligand |
| resolution:[0 TO 2.0] | High resolution structures |
| release_year:2025 | Structures from 2025 |
| revision_year:2024 | Recently revised structures |
| experimental_method:"X-ray diffraction" | By experimental method |
| organism_scientific_name:"Homo sapiens" | By organism |
| has_carb_polymer:Y | Has carbohydrate |
| has_bound_molecule:Y | Has ligands |
# Human kinases with resolution < 2Å from 2024
query = 'release_year:2024 AND organism_scientific_name:"Homo sapiens" AND molecule_name:*kinase* AND resolution:[* TO 2.0]'
# ABL1 from human OR mouse
query = 'molecule_name:*ABL1* AND (organism_scientific_name:"Homo sapiens" OR organism_scientific_name:"Mus musculus")'
import json
def structure_quality_report(imol):
    """Generate quality report using PDBe API data"""
    # Get PDB ID from molecule
    pdb_file = coot.molecule_name(imol)
    # Extract PDB ID from filename (assumes format like "pdb4wa9.ent" or "4wa9")
    import re
    match = re.search(r'(\d\w{3})', pdb_file.lower())
    if not match:
        print("Could not extract PDB ID from filename")
        return
    pdb_id = match.group(1)
    # Get structure info
    data = safe_pdbe_query(f"https://www.ebi.ac.uk/pdbe/api/pdb/entry/summary/{pdb_id}")
    if not data:
        return
    entry = data[pdb_id][0]
    print("=" * 60)
    print(f"STRUCTURE QUALITY REPORT: {pdb_id.upper()}")
    print("=" * 60)
    print(f"Title: {entry['title']}")
    print(f"Method: {entry['experimental_method']}")
    print(f"Released: {entry['release_date']}")
    print(f"Revised: {entry['revision_date']}")
    # Check for significant revisions
    if entry['revision_date'] != entry['release_date']:
        from datetime import datetime
        release = datetime.strptime(entry['release_date'], "%Y%m%d")
        revision = datetime.strptime(entry['revision_date'], "%Y%m%d")
        days = (revision - release).days
        print(f"\n⚠️ STRUCTURE REVISED {days} days after release")
        print("   Check PDBe for revision details")
    print("=" * 60)
# Usage: structure_quality_report(0)
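The pattern `(\d\w{3})` matches the first digit-led run of four word characters anywhere in the molecule name, so a path such as `2024-run/4wa9.cif` would yield `2024`. A stricter sketch is shown below. `guess_pdb_id` is a hypothetical helper that assumes the file is named after its classic four-character PDB ID, optionally prefixed with `pdb` and followed by a suffix or extension.

```python
import os
import re

# Optional "pdb" prefix, a digit plus three alphanumerics, then either the
# end of the name or a separator (., _ or -) introducing a suffix/extension.
_PDB_ID = re.compile(r"^(?:pdb)?([0-9][a-z0-9]{3})(?:[._-].*)?$")


def guess_pdb_id(name):
    """Return the 4-character PDB ID implied by a file name, or None."""
    base = os.path.basename(name.strip()).lower()
    match = _PDB_ID.match(base)
    return match.group(1) if match else None


for example in ("pdb4wa9.ent", "2024-run/4wa9.cif.gz", "model_final.pdb"):
    print(example, "->", guess_pdb_id(example))
```

Returning None for names that do not look like PDB entries lets the caller skip the API query instead of looking up an unrelated ID.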
The PDBe API provides rich programmatic access to structure metadata, validation data, and search capabilities. Using coot.coot_get_url_as_string_py(), you can:
Key Capabilities:
Key Limitations:
This enables powerful automated quality checks, cross-species structure comparison, data-driven validation, and integration of PDB metadata into Coot-based structural biology workflows.