coot-validation
// Comprehensive structure validation combining model-to-map analysis and unmodeled density detection
API documentation to be loaded at startup - when starting a Coot session, immediately call get_function_descriptions() with the functions listed in this skill.
| name | coot-validation |
| description | Comprehensive structure validation combining model-to-map analysis and unmodeled density detection |
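When performing structure validation in Coot with both a model and a map, analyze the structure from three complementary perspectives: model-to-map fit (how well the model agrees with the density), atom overlaps (packing clashes between atoms), and map-to-model analysis (density that the model does not explain).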
All three perspectives are essential for comprehensive validation.
These functions analyze how well your current model fits the density. They lead you to places in the model that need attention:
CRITICAL: rotamer_graphs_py() and score_rotamers_py() report different things
There are two functions that report rotamer information, and they measure fundamentally different aspects:
rotamer_graphs_py(imol) - Continuous Probability Density
Returns the probability density at the exact chi angles of the current conformation.
rotamers = coot.rotamer_graphs_py(0)
# Returns: [[chain_id, resno, ins_code, score_percentage, resname], ...]
# score_percentage is the continuous probability density at the actual chi angles
Interpretation guidelines:
score_rotamers_py(...) - Discrete Bin Probabilities
Returns the discrete rotamer library showing what % of structures have each named rotamer.
rotamers = coot.score_rotamers_py(0, "A", 42, "", "", 1, 1, 0.001)
# Returns: [[name, probability, density_score, atom_list, richardson_score], ...]
# probability is the discrete bin frequency (e.g., m-85 appears in 43% of structures)
Why the scores differ:
Number of chi angles determines score expectations:
| Residue Type | Chi Angles | Typical # Rotamers | Best Rotamer % | Good Score |
|---|---|---|---|---|
| VAL, THR, SER | 1 | 3 | 70-75% | > 40% |
| PHE, TYR, ASP, ASN, ILE, LEU | 2 | 4-9 | 35-45% | > 20% |
| GLU, GLN, MET | 3 | 9-15 | 20-35% | > 10% |
| LYS, ARG | 4 | 30-35 | 9-10% | > 5% |
Key insight: More chi angles = more rotamers = lower individual probabilities
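The "Good Score" column of the table can be expressed as a small lookup keyed on the number of chi angles, so that a score is judged against its residue class rather than one global cutoff. This is a minimal sketch: the names `good_score_threshold` and `below_good_score` are hypothetical helpers, not Coot functions, and the caller is assumed to know the chi-angle count of the residue.

```python
# "Good Score" thresholds (percent) from the table, keyed by number of chi angles.
GOOD_SCORE_BY_N_CHI = {1: 40.0, 2: 20.0, 3: 10.0, 4: 5.0}


def good_score_threshold(n_chi):
    """Return the table's 'Good Score' threshold, or None if n_chi is not listed."""
    return GOOD_SCORE_BY_N_CHI.get(n_chi)


def below_good_score(n_chi, score):
    """True if `score` (percent) is below the threshold for this chi-angle count.

    Residues with no listed threshold (e.g. GLY and ALA, which have no
    rotamers) return False rather than guessing a cutoff.
    """
    threshold = good_score_threshold(n_chi)
    return threshold is not None and score < threshold
```

A score of 8% is therefore below the bar for a one-chi residue but above it for a four-chi residue, which is the point of the table.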
Examples from validation:
DON'T: Use arbitrary cutoffs like "< 10% is bad"
DO: Use context-aware validation:
# Step 1: Get current rotamer scores
rotamers = coot.rotamer_graphs_py(0)
# Step 2: For flagged residues, get alternatives
for chain, resno, inscode, score, resname in rotamers:
    if score < 20:  # Preliminary flag
        # Get all possible rotamers to understand context
        alternatives = coot.score_rotamers_py(0, chain, resno, "", "", 1, 1, 0.001)
        if len(alternatives) == 0:
            continue  # GLY, ALA - no rotamers
        # Sort by density fit
        sorted_alts = sorted(alternatives, key=lambda x: x[2], reverse=True)
        best_density = sorted_alts[0][2]
        # Placeholder: this equals best_density, so the density comparison
        # below never fires until it is replaced with the density score of
        # the current conformation.
        current_density = sorted_alts[0][2]
        # Check how many rotamers exist
        n_rotamers = len(alternatives)
        # Decision logic
        if n_rotamers < 5 and score < 10:
            # Few rotamers (VAL, PHE, etc.) and low score = likely wrong
            print(f"PROBLEM: {chain}/{resno} {resname}: {score:.1f}% (few alternatives)")
        elif n_rotamers > 20 and score < 2:
            # Many rotamers (LYS, ARG) and very low score = likely wrong
            print(f"PROBLEM: {chain}/{resno} {resname}: {score:.1f}% (many alternatives)")
        elif best_density - current_density > 3.0:
            # Alternative has much better density fit
            print(f"PROBLEM: {chain}/{resno} {resname}: better rotamer available")
A residue needs fixing if:
A low rotamer score alone is NOT sufficient - always check:
Atom overlap detection identifies clashes between atoms that may not be caught by local geometry validation. These reveal packing problems such as:
Critical insight: Ramachandran and rotamer validation catch local geometry problems (within a residue or its immediate neighbors), while atom overlap detection catches global packing problems between any atoms in the structure.
The blob-finding function identifies regions of significant density that are not explained by your current model. It leads you to places in the map where you might be missing:
Always include blob detection when performing structure validation with a map.
blobs = coot.find_blobs_py(
imol_model=0, # your protein model
imol_map=1, # the map to search (often difference map)
cut_off_density_level=3.0 # sigma threshold (typically 2.5-4.0)
)
# Returns: list of (position, score) tuples
# [(clipper::Coord_orth, float), ...]
- imol_model: The model molecule - density explained by this model will be excluded
- imol_map: The map to search for blobs (usually a difference map, but can be a regular map)
- cut_off_density_level: Sigma threshold for blob detection
for position, score in blobs:
    x = position.x()
    y = position.y()
    z = position.z()
    print(f"Blob at ({x:.2f}, {y:.2f}, {z:.2f}) - score: {score:.2f}")
The score represents the strength/volume of the unmodeled density. Higher scores indicate more significant features that should be investigated.
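Because higher scores mark the more significant features, it is usually convenient to look at blobs strongest first. The sketch below assumes the return shape described above, a list of (position, score) pairs whose position exposes x(), y() and z(); `rank_blobs` and its `min_score` parameter are hypothetical names introduced for illustration.

```python
def rank_blobs(blobs, min_score=0.0):
    """Return blobs as (score, (x, y, z)) tuples, strongest first.

    `blobs` is assumed to be a list of (position, score) pairs, where
    position provides x(), y() and z() accessor methods.
    """
    ranked = [
        (score, (position.x(), position.y(), position.z()))
        for position, score in blobs
        if score >= min_score
    ]
    # Sort on score only, so blobs with equal scores keep their input order.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

Converting the positions to plain tuples also makes the result easy to print or store alongside other validation output.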
The user likes to see what you are considering and how you change the model, so where possible use coot.set_rotation_centre() or coot.set_go_to_atom_chain_residue_atom_name() to bring the issue currently under examination to the centre of the screen.
# Ramachandran outliers
rama_outliers = coot.all_molecule_ramachandran_score_py(0)
# Rotamer outliers
rotamer_outliers = coot.rotamer_graphs_py(0)
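# Per-residue density correlation
# (the argument order here differs from the quick-reference signature at the
# end of this document; confirm it with get_function_descriptions() before use)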
correlation_stats = coot.map_to_model_correlation_stats_per_residue_range_py(
0, # imol_model
"A", # chain_id
1, # start_resno
100, # end_resno
1 # imol_map
)
# Geometry validation
chiral = coot.chiral_volume_errors_py(0)
# Get worst 30 atom overlaps
overlaps = coot.molecule_atom_overlaps_py(0, 30)
# Check for severe clashes
severe_clashes = [o for o in overlaps if o['overlap-volume'] > 5.0]
if severe_clashes:
    print(f"WARNING: {len(severe_clashes)} severe clashes found!")
# For full analysis (caution: can be very large!)
# all_overlaps = coot.molecule_atom_overlaps_py(0, -1)
# Find unmodeled density in difference map
diff_map_blobs = coot.find_blobs_py(
imol_model=0,
imol_map=2, # difference map
cut_off_density_level=3.0
)
# Find features in regular map (alternative approach)
regular_map_blobs = coot.find_blobs_py(
imol_model=0,
imol_map=1, # 2mFo-DFc map
cut_off_density_level=1.0 # Lower threshold for fitted map
)
def comprehensive_validation(imol_model, imol_map, imol_diff_map=None):
    """
    Perform complete structure validation combining model and map analysis.
    Returns dictionary with all validation metrics.
    """
    results = {}
    # Model-to-map validation
    results['ramachandran'] = coot.all_molecule_ramachandran_score_py(imol_model)
    results['rotamers'] = coot.rotamer_graphs_py(imol_model)
    # Atom overlap validation
    results['atom_overlaps'] = coot.molecule_atom_overlaps_py(imol_model, 30)
    severe_clashes = [o for o in results['atom_overlaps'] if o['overlap-volume'] > 5.0]
    results['severe_clash_count'] = len(severe_clashes)
    # Per-residue correlation (requires chain info)
    import coot_utils
    chains = coot_utils.chain_ids(imol_model)
    results['correlation_by_chain'] = {}
    for chain in chains:
        n_residues = coot.chain_n_residues(chain, imol_model)
        if n_residues > 0:
            stats = coot.map_to_model_correlation_stats_per_residue_range_py(
                imol_model, chain, 1, 9999, imol_map
            )
            results['correlation_by_chain'][chain] = stats
    # Map-to-model validation (blobs)
    if imol_diff_map is not None:
        results['diff_map_blobs'] = coot.find_blobs_py(
            imol_model, imol_diff_map, 3.0
        )
    results['map_blobs'] = coot.find_blobs_py(
        imol_model, imol_map, 1.0
    )
    return results
# Usage
validation = comprehensive_validation(
imol_model=0,
imol_map=1,
imol_diff_map=2
)
Difference Map (mFo-DFc) Blobs:
Regular Map (2mFo-DFc) Blobs:
blobs = coot.find_blobs_py(0, 2, 3.0) # diff map, 3 sigma
# Large score (>50): Likely missing ligand, metal, or several waters
# Medium score (10-50): Likely 1-3 waters or alternative conformation
# Small score (3-10): Likely single water or weak alternative conformation
for position, score in blobs:
    if score > 50:
        print(f"Large feature at {position} - investigate for ligand/metal")
    elif score > 10:
        print(f"Medium feature at {position} - likely waters")
    else:
        print(f"Small feature at {position} - check carefully")
Always include atom overlap checking when validating structure geometry.
# Get worst 30 atom overlaps (default behavior after API update)
overlaps = coot.molecule_atom_overlaps_py(
imol=0,
n_pairs=30 # Number of worst overlaps to return (default: 30)
)
# Get ALL overlaps (use with caution - can be hundreds!)
all_overlaps = coot.molecule_atom_overlaps_py(
imol=0,
n_pairs=-1 # -1 means return all overlaps
)
# Each overlap is a dict with:
# {
# 'atom-1-spec': [imol, chain, resno, inscode, atom_name, altconf],
# 'atom-2-spec': [imol, chain, resno, inscode, atom_name, altconf],
#     'overlap-volume': float,  # in Å³
# 'radius-1': float,
# 'radius-2': float
# }
Overlap volume indicates severity:
Common clash patterns:
overlaps = coot.molecule_atom_overlaps_py(0, 30)
for overlap in overlaps:
    atom1 = overlap['atom-1-spec']
    atom2 = overlap['atom-2-spec']
    volume = overlap['overlap-volume']
    chain1, res1, atom_name1 = atom1[1], atom1[2], atom1[4]
    chain2, res2, atom_name2 = atom2[1], atom2[2], atom2[4]
    if volume > 5.0:
        print(f"SEVERE: {chain1}/{res1} {atom_name1} ↔ {chain2}/{res2} {atom_name2}: {volume:.2f} Å³")
    elif volume > 2.0:
        print(f"MODERATE: {chain1}/{res1} {atom_name1} ↔ {chain2}/{res2} {atom_name2}: {volume:.2f} Å³")
Example from tutorial data:
Key lesson: A model can have perfect Ramachandran and rotamer scores but catastrophic packing problems. You need both local geometry validation (Rama/rotamer) AND global packing validation (overlaps).
CRITICAL: Never use absolute rotamer score thresholds without considering residue type
Before prioritizing rotamer fixes, understand what's "bad" for each residue:
def assess_rotamer_severity(chain, resno, score, resname):
    """
    Determine if a rotamer score is actually problematic.
    Returns: 'critical', 'moderate', or 'acceptable'
    """
    # Get all possible rotamers to understand the distribution
    alternatives = coot.score_rotamers_py(0, chain, resno, "", "", 1, 1, 0.001)
    n_rotamers = len(alternatives)
    # Context-aware thresholds based on number of possible rotamers
    if n_rotamers <= 3:  # VAL, THR, SER (1 chi)
        if score < 10: return 'critical'
        elif score < 30: return 'moderate'
        else: return 'acceptable'
    elif n_rotamers <= 9:  # PHE, TYR, etc. (2 chi)
        if score < 5: return 'critical'
        elif score < 15: return 'moderate'
        else: return 'acceptable'
    elif n_rotamers <= 15:  # GLU, GLN, MET (3 chi)
        if score < 3: return 'critical'
        elif score < 10: return 'moderate'
        else: return 'acceptable'
    else:  # LYS, ARG (4 chi, 30+ rotamers)
        if score < 1: return 'critical'
        elif score < 5: return 'moderate'
        else: return 'acceptable'
Priority 1: Combined problems (multiple red flags)
Priority 2: Single severe issues
Important: A low rotamer score with GOOD density correlation (>0.8) may be correct - it could be a genuine unusual but real conformation. Don't "fix" it unless there's supporting evidence (clashes, poor density, chemical implausibility).
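The rule above can be written as a small decision helper that refuses to flag a low-scoring rotamer unless something else supports the change. This is a sketch under stated assumptions: `is_fix_candidate` is a hypothetical name, only clashes and density correlation are considered (chemical implausibility is left to manual inspection), and treating any correlation not above 0.8 as supporting evidence is a simplification.

```python
def is_fix_candidate(severity, correlation, has_clash, good_correlation=0.8):
    """Decide whether a rotamer flagged by its score should be changed.

    severity:    'critical', 'moderate' or 'acceptable'
                 (as returned by assess_rotamer_severity)
    correlation: per-residue density correlation for the residue
    has_clash:   True if the residue takes part in an atom overlap
    """
    if severity == 'acceptable':
        # The rotamer score itself gives no reason to change the residue.
        return False
    if correlation > good_correlation and not has_clash:
        # Unusual but well supported by density: likely a genuine conformation.
        return False
    return True
```

For example, a 'critical' LYS with a correlation of 0.9 and no clash is left alone, while the same residue involved in a clash becomes a candidate for refitting.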
score_rotamers_py()
def validate_and_fix_chain(imol_model, chain_id, imol_map, imol_diff_map):
    """
    Automated validation and suggested fixes for a chain.
    """
    issues = []
    # 1. Check for atom overlaps
    overlaps = coot.molecule_atom_overlaps_py(imol_model, 50)
    for overlap in overlaps:
        atom1 = overlap['atom-1-spec']
        atom2 = overlap['atom-2-spec']
        volume = overlap['overlap-volume']
        # Only report if at least one atom is in this chain
        if atom1[1] == chain_id or atom2[1] == chain_id:
            severity = 'high' if volume > 5.0 else ('medium' if volume > 2.0 else 'low')
            issues.append({
                'type': 'atom_overlap',
                'atom1': f"{atom1[1]}/{atom1[2]} {atom1[4]}",
                'atom2': f"{atom2[1]}/{atom2[2]} {atom2[4]}",
                'severity': severity,
                'value': volume
            })
    # 2. Check correlation for each residue
    stats = coot.map_to_model_correlation_stats_per_residue_range_py(
        imol_model, chain_id, 1, 9999, imol_map
    )
    for residue_spec, correlation in stats:
        if correlation < 0.7:  # Poor fit threshold
            issues.append({
                'type': 'poor_correlation',
                'residue': residue_spec,
                'severity': 'high',
                'value': correlation
            })
    # 3. Find nearby blobs that might explain poor correlation
    blobs = coot.find_blobs_py(imol_model, imol_diff_map, 3.0)
    for position, score in blobs:
        issues.append({
            'type': 'unmodeled_density',
            'position': (position.x(), position.y(), position.z()),
            'severity': 'high' if score > 50 else 'medium',
            'score': score
        })
    # 4. Check Ramachandran
    rama = coot.all_molecule_ramachandran_score_py(imol_model)
    for outlier in rama:
        if outlier[4] == 'OUTLIER':  # Ramachandran region
            issues.append({
                'type': 'ramachandran_outlier',
                'residue': outlier[0:3],  # chain, resno, inscode
                'severity': 'high'
            })
    return sorted(issues, key=lambda x: {'high': 0, 'medium': 1, 'low': 2}[x['severity']])
# Usage
issues = validate_and_fix_chain(0, "A", 1, 2)
for issue in issues[:10]:  # Top 10 issues
    print(f"{issue['type']}: {issue}")
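Before walking through individual issues, a per-type tally gives a quick overview of where the chain's problems lie. The sketch below assumes issue dicts with the 'type' and 'severity' keys used in validate_and_fix_chain(); `summarise_issues` is a hypothetical helper name.

```python
from collections import Counter

# Same severity ordering as the sort key in validate_and_fix_chain().
SEVERITY_ORDER = {'high': 0, 'medium': 1, 'low': 2}


def summarise_issues(issues):
    """Count issues by (severity, type), most severe first.

    Returns a list of ((severity, type), count) pairs.
    """
    counts = Counter((issue['severity'], issue['type']) for issue in issues)
    return sorted(
        counts.items(),
        key=lambda item: (SEVERITY_ORDER[item[0][0]], item[0][1]),
    )
```

The result can be printed directly, for instance with `for (severity, kind), n in summarise_issues(issues): print(severity, kind, n)`.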
# Find blobs in difference map
blobs = coot.find_blobs_py(0, 2, 3.0)
# Add waters at blob positions
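for position, score in blobs:
    if 5 < score < 30:  # Typical water blob size
        # Check if appropriate for water
        x, y, z = position.x(), position.y(), position.z()
        # place_typed_atom_at_pointer() adds the atom at the screen centre,
        # so move the centre to the blob first
        coot.set_rotation_centre(x, y, z)
        # Add water at this position
        coot.place_typed_atom_at_pointer("HOH")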
# Look for large blobs that might be missing residues
blobs = coot.find_blobs_py(0, 2, 3.0)
missing_residue_candidates = [
(pos, score) for pos, score in blobs
if score > 100 # Large feature
]
for position, score in missing_residue_candidates:
    print(f"Large unmodeled density at {position} - check for missing residues")
get_hydrogen_bonds_py()
coot.get_hydrogen_bonds_py(imol, selection_1, selection_2, mcdonald_and_thornton)
Parameters:
- imol: model molecule index
- selection_1: MMDB selection string for first group (e.g. "//A/35")
- selection_2: MMDB selection string for second group (e.g. "//A/34-56")
- mcdonald_and_thornton: 1 = use McDonald & Thornton algorithm (requires H atoms); 0 = geometry-only

Returns: list of H-bond candidates. Each entry is a list of 12 elements:
[0] hydrogen atom (dict, or None if no H)
[1] donor atom (dict)
[2] acceptor atom (dict)
[3] donor neighbour/antecedent atom (dict, or None)
[4] acceptor neighbour/antecedent atom (dict, or None)
[5] angle_1 (float, degrees)
[6] angle_2 (float, degrees)
[7] angle_3 (float, degrees)
[8] distance (float, Å)
[9] ligand_atom_is_donor (bool)
[10] hydrogen_is_ligand_atom (bool)
[11] bond_has_hydrogen_flag (bool)
Each atom dict has keys: x, y, z, charge, occ, b_iso, element, name, model, chain, altLoc, residue_name
IMPORTANT: Always use mcdonald_and_thornton=0 unless the model has explicit hydrogens.
The function returns all geometrically plausible H-bond candidates — distance alone is not
sufficient to confirm a hydrogen bond; the angles must also be checked.
Example:
hbonds = coot.get_hydrogen_bonds_py(0, "//A/35", "//A/50-56", 0)
for hb in hbonds:
    donor = hb[1]
    acceptor = hb[2]
    dist = hb[8]
    has_H = hb[11]
    d_str = donor['chain'] + " " + donor['residue_name'] + " " + donor['name'].strip()
    a_str = acceptor['chain'] + " " + acceptor['residue_name'] + " " + acceptor['name'].strip()
    print("H-bond: " + d_str + " -> " + a_str + " dist=" + str(dist) + " has_H=" + str(has_H))
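Since distance alone does not confirm a hydrogen bond, the candidate list can be narrowed with a combined distance and angle check. This sketch relies on the 12-element layout listed above. This document does not say which geometric angle each of slots [5] to [7] holds, so the `angle_index` default and both cutoff values are illustrative assumptions to be replaced with criteria appropriate to the analysis. `filter_hbonds` is a hypothetical helper name.

```python
def filter_hbonds(hbonds, max_distance=3.5, min_angle=90.0, angle_index=5):
    """Keep H-bond candidates passing one distance and one angle cutoff.

    Assumes each entry follows the documented layout: element [8] is the
    distance in Angstroms and elements [5], [6], [7] are angles in degrees.
    The cutoffs and the choice of angle slot are illustrative assumptions.
    """
    if angle_index not in (5, 6, 7):
        raise ValueError("angle_index must be 5, 6 or 7")
    kept = []
    for hb in hbonds:
        distance = hb[8]
        angle = hb[angle_index]
        if distance <= max_distance and angle >= min_angle:
            kept.append(hb)
    return kept
```

Keeping the angle slot as a parameter lets the same filter be applied to each of the three reported angles in turn once their meaning has been confirmed.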
# Rotamer validation - primary metric (continuous probability density)
rotamers = coot.rotamer_graphs_py(imol)
# Returns: [[chain_id, resno, ins_code, score_percentage, resname], ...]
# Rotamer alternatives - for understanding context
alternatives = coot.score_rotamers_py(imol, chain, resno, "", "", imol_map, 1, 0.001)
# Returns: [[name, probability, density_score, atom_list, richardson_score], ...]
# Atom overlap detection
overlaps = coot.molecule_atom_overlaps_py(imol, n_pairs=30) # Default: 30 worst
all_overlaps = coot.molecule_atom_overlaps_py(imol, n_pairs=-1) # All overlaps
# Blob detection (map-to-model)
blobs = coot.find_blobs_py(imol_model, imol_map, sigma_cutoff)
# Ramachandran validation
rama = coot.all_molecule_ramachandran_score_py(imol)
# Rotamer validation
rotamers = coot.rotamer_graphs_py(imol)
# Density correlation (model-to-map)
corr = coot.map_to_model_correlation_stats_per_residue_range_py(
imol, chain, imol_map, n_per_range, exclude_NOC_flag
)
# Geometry validation
chiral = coot.chiral_volume_errors_py(imol)
Remember: