一键导入
cobrapy
Constraint-based metabolic modeling (COBRA). FBA, FVA, gene knockouts, flux sampling, SBML models, for systems biology and metabolic engineering analysis.
菜单
Constraint-based metabolic modeling (COBRA). FBA, FVA, gene knockouts, flux sampling, SBML models, for systems biology and metabolic engineering analysis.
Build with and use Pi, the minimal terminal coding harness. Use for installing Pi, configuring providers/models/settings, creating Pi skills/extensions/packages/themes/prompt templates, embedding Pi through the SDK, integrating over RPC or JSON event streams, parsing sessions, or developing custom Pi providers and TUI components.
Query the CZ CELLxGENE Census programmatically for versioned public single-cell and spatial transcriptomics data. Use when you need population-scale cell metadata, gene expression slices, Census summary counts, source H5AD URIs/downloads, embeddings, spatial Census data, or reference atlas comparisons across organisms, tissues, diseases, assays, and cell types. For analyzing your own local single-cell data use scanpy, anndata, or scvi-tools.
NGS analysis toolkit. BAM to bigWig conversion, QC (correlation, PCA, fingerprints), heatmaps/profiles (TSS, peaks), for ChIP-seq, RNA-seq, ATAC-seq visualization.
DiffDock and DiffDock-L molecular docking. Use for protein-small-molecule pose prediction from PDB or sequence plus SMILES/SDF/MOL2, batch docking, virtual screening, and pose-confidence interpretation. Not for binding affinity prediction.
Use when working directly with the `esm` Python SDK, ESM3 or ESMC model IDs, Forge/Biohub inference clients, or ESMFold2 folding workflows.
Fast CLI/Python queries to 20+ bioinformatics databases. Use for quick lookups: gene info, BLAST/BLAT, viral sequence downloads, AlphaFold structures, enrichment analysis, OpenTargets, COSMIC, CELLxGENE, and 8cube mouse specificity/expression data. Best for interactive exploration and simple queries. For batch processing or advanced BLAST use biopython; for multi-database Python workflows use bioservices.
| name | cobrapy |
| description | Constraint-based metabolic modeling (COBRA). FBA, FVA, gene knockouts, flux sampling, SBML models, for systems biology and metabolic engineering analysis. |
| license | GPL-2.0 license |
| allowed-tools | Read Write Edit Bash |
| compatibility | Requires Python 3.9+ (cobra 0.30+ dropped 3.8). Install with uv pip install. GLPK (swiglpk) is the default solver; CPLEX/Gurobi optional. load_model fetches from bundled data, BiGG, or BioModels (network required for remote models). |
| metadata | {"version":"1.1","skill-author":"K-Dense Inc."} |
COBRApy is a Python library for constraint-based reconstruction and analysis (COBRA) of metabolic models, essential for systems biology research. Work with genome-scale metabolic models, perform computational simulations of cellular metabolism, conduct metabolic engineering analyses, and predict phenotypic behaviors.
Version note: Examples target cobra 0.31.1 on PyPI (import cobra). Docs: cobrapy.readthedocs.io. Repo: opencobra/cobrapy.
Use this skill when:
uv pip install "cobra==0.31.1"
MATLAB model I/O (optional):
uv pip install "cobra[array]==0.31.1"
COBRApy uses optlang for solvers. GLPK installs automatically via swiglpk. For large MILPs/QPs, cobra 0.29+ adds a hybrid solver (HIGHS/OSQP); model.solver = "osqp" now routes through hybrid and may error on plain LPs in a future release—prefer model.solver = "hybrid" when available.
COBRApy provides comprehensive tools organized into several key areas:
Load existing models from repositories or files:
from cobra.io import load_model
# Bundled locally (no network): textbook, iJO1366, salmonella
model = load_model("textbook") # alias for e_coli_core (95 reactions)
model = load_model("e_coli_core") # same core E. coli model
model = load_model("iJO1366") # genome-scale E. coli (bundled)
model = load_model("salmonella") # Salmonella iYS1720 (bundled)
# Remote (BiGG / BioModels; requires network, cached after first fetch)
model = load_model("iML1515") # E. coli genome-scale on BiGG
# Load from files
from cobra.io import read_sbml_model, load_json_model, load_yaml_model
model = read_sbml_model("path/to/model.xml")
model = load_json_model("path/to/model.json")
model = load_yaml_model("path/to/model.yml")
Save models in various formats:
from cobra.io import write_sbml_model, save_json_model, save_yaml_model
write_sbml_model(model, "output.xml") # Preferred format
save_json_model(model, "output.json") # For Escher compatibility
save_yaml_model(model, "output.yml") # Human-readable
Access and inspect model components:
# Access components
model.reactions # DictList of all reactions
model.metabolites # DictList of all metabolites
model.genes # DictList of all genes
# Get specific items by ID or index
reaction = model.reactions.get_by_id("PFK")
metabolite = model.metabolites[0]
# Inspect properties
print(reaction.reaction) # Stoichiometric equation
print(reaction.bounds) # Flux constraints
print(reaction.gene_reaction_rule) # GPR logic
print(metabolite.formula) # Chemical formula
print(metabolite.compartment) # Cellular location
Perform standard FBA simulation:
# Basic optimization
solution = model.optimize()
print(f"Objective value: {solution.objective_value}")
print(f"Status: {solution.status}")
# Access fluxes
print(solution.fluxes["PFK"])
print(solution.fluxes.head())
# Fast optimization (objective value only)
objective_value = model.slim_optimize()
# Change objective
model.objective = "ATPM"
solution = model.optimize()
Parsimonious FBA (minimize total flux):
from cobra.flux_analysis import pfba
solution = pfba(model)
Geometric FBA (find central solution):
from cobra.flux_analysis import geometric_fba
solution = geometric_fba(model)
Determine flux ranges for all reactions:
from cobra.flux_analysis import flux_variability_analysis
# Standard FVA
fva_result = flux_variability_analysis(model)
# FVA at 90% optimality
fva_result = flux_variability_analysis(model, fraction_of_optimum=0.9)
# Loopless FVA (eliminates thermodynamically infeasible loops)
fva_result = flux_variability_analysis(model, loopless=True)
# FVA for specific reactions
fva_result = flux_variability_analysis(
model,
reaction_list=["PFK", "FBA", "PGI"]
)
Perform knockout analyses:
from cobra.flux_analysis import (
single_gene_deletion,
single_reaction_deletion,
double_gene_deletion,
double_reaction_deletion
)
# Single deletions
gene_results = single_gene_deletion(model)
reaction_results = single_reaction_deletion(model)
# Double deletions (uses multiprocessing)
double_gene_results = double_gene_deletion(
model,
processes=4 # Number of CPU cores
)
# Manual knockout using context manager
with model:
model.genes.get_by_id("b0008").knock_out()
solution = model.optimize()
print(f"Growth after knockout: {solution.objective_value}")
# Model automatically reverts after context exit
Manage growth medium:
# View current medium
print(model.medium)
# Modify medium (must reassign entire dict)
medium = model.medium
medium["EX_glc__D_e"] = 10.0 # Set glucose uptake
medium["EX_o2_e"] = 0.0 # Anaerobic conditions
model.medium = medium
# Calculate minimal media
from cobra.medium import minimal_medium
# Minimize total import flux
min_medium = minimal_medium(model, minimize_components=False)
# Minimize number of components (uses MILP, slower)
min_medium = minimal_medium(
model,
minimize_components=True,
open_exchanges=True
)
Sample the feasible flux space:
from cobra.sampling import sample
# Sample using OptGP (default, supports parallel processing)
samples = sample(model, n=1000, method="optgp", processes=4)
# Sample using ACHR
samples = sample(model, n=1000, method="achr")
# Validate samples
from cobra.sampling import OptGPSampler
sampler = OptGPSampler(model, processes=4)
sampler.sample(1000)
validation = sampler.validate(sampler.samples)
print(validation.value_counts()) # Should be all 'v' for valid
Calculate phenotype phase planes:
from cobra.flux_analysis import production_envelope
# Standard production envelope
envelope = production_envelope(
model,
reactions=["EX_glc__D_e", "EX_o2_e"],
objective="EX_ac_e" # Acetate production
)
# With carbon yield
envelope = production_envelope(
model,
reactions=["EX_glc__D_e", "EX_o2_e"],
carbon_sources="EX_glc__D_e"
)
# Visualize (use matplotlib or pandas plotting)
import matplotlib.pyplot as plt
envelope.plot(x="EX_glc__D_e", y="EX_o2_e", kind="scatter")
plt.show()
Add reactions to make models feasible:
from cobra.flux_analysis import gapfill
# Provide a universal reaction database (SBML/JSON); not bundled in cobra 0.31+
from cobra.io import read_sbml_model
universal = read_sbml_model("path/to/universal_reactions.xml")
# Perform gapfilling
with model:
# Remove reactions to create gaps for demonstration
model.remove_reactions([model.reactions.PGI])
# Find reactions needed
solution = gapfill(model, universal)
print(f"Reactions to add: {solution}")
Build models from scratch:
from cobra import Model, Reaction, Metabolite
# Create model
model = Model("my_model")
# Create metabolites
atp_c = Metabolite("atp_c", formula="C10H12N5O13P3",
name="ATP", compartment="c")
adp_c = Metabolite("adp_c", formula="C10H12N5O10P2",
name="ADP", compartment="c")
pi_c = Metabolite("pi_c", formula="HO4P",
name="Phosphate", compartment="c")
# Create reaction
reaction = Reaction("ATPASE")
reaction.name = "ATP hydrolysis"
reaction.subsystem = "Energy"
reaction.lower_bound = 0.0
reaction.upper_bound = 1000.0
# Add metabolites with stoichiometry
reaction.add_metabolites({
atp_c: -1.0,
adp_c: 1.0,
pi_c: 1.0
})
# Add gene-reaction rule
reaction.gene_reaction_rule = "(gene1 and gene2) or gene3"
# Add to model
model.add_reactions([reaction])
# Add boundary reactions
model.add_boundary(atp_c, type="exchange")
model.add_boundary(adp_c, type="demand")
# Set objective
model.objective = "ATPASE"
from cobra.io import load_model
# Load model (textbook = fast tutorial; iJO1366 / iML1515 for genome-scale)
model = load_model("textbook")
# Run FBA
solution = model.optimize()
print(f"Growth rate: {solution.objective_value:.3f} /h")
# Show active pathways
print(solution.fluxes[solution.fluxes.abs() > 1e-6])
from cobra.io import load_model
from cobra.flux_analysis import single_gene_deletion
# Load model
model = load_model("textbook")
baseline = model.slim_optimize()
# Perform single gene deletions
results = single_gene_deletion(model)
# Find essential genes (growth < threshold)
essential_genes = results[results["growth"] < 0.01]
print(f"Found {len(essential_genes)} essential genes")
# Find genes with minimal impact
neutral_genes = results[results["growth"] > 0.9 * baseline]
from cobra.io import load_model
from cobra.medium import minimal_medium
# Load model
model = load_model("textbook")
# Calculate minimal medium for 50% of max growth
target_growth = model.slim_optimize() * 0.5
min_medium = minimal_medium(
model,
target_growth,
minimize_components=True
)
print(f"Minimal medium components: {len(min_medium)}")
print(min_medium)
from cobra.io import load_model
from cobra.flux_analysis import flux_variability_analysis
from cobra.sampling import sample
# Load model
model = load_model("textbook")
# First check flux ranges at optimality
fva = flux_variability_analysis(model, fraction_of_optimum=1.0)
# For reactions with large ranges, sample to understand distribution
samples = sample(model, n=1000)
# Analyze specific reaction
reaction_id = "PFK"
import matplotlib.pyplot as plt
samples[reaction_id].hist(bins=50)
plt.xlabel(f"Flux through {reaction_id}")
plt.ylabel("Frequency")
plt.show()
Use context managers to make temporary modifications:
# Model remains unchanged outside context
with model:
# Temporarily change objective
model.objective = "ATPM"
# Temporarily modify bounds
model.reactions.EX_glc__D_e.lower_bound = -5.0
# Temporarily knock out genes
model.genes.b0008.knock_out()
# Optimize with changes
solution = model.optimize()
print(f"Modified growth: {solution.objective_value}")
# All changes automatically reverted
solution = model.optimize()
print(f"Original growth: {solution.objective_value}")
Models use DictList objects for reactions, metabolites, and genes - behaving like both lists and dictionaries:
# Access by index
first_reaction = model.reactions[0]
# Access by ID
pfk = model.reactions.get_by_id("PFK")
# Query methods
atp_reactions = model.reactions.query("atp")
Reaction bounds define feasible flux ranges:
lower_bound = 0, upper_bound > 0lower_bound < 0, upper_bound > 0.bounds to avoid inconsistenciesBoolean logic linking genes to reactions:
# AND logic (both required)
reaction.gene_reaction_rule = "gene1 and gene2"
# OR logic (either sufficient)
reaction.gene_reaction_rule = "gene1 or gene2"
# Complex logic
reaction.gene_reaction_rule = "(gene1 and gene2) or (gene3 and gene4)"
Special reactions representing metabolite import/export:
EX_ by conventionmodel.medium dictionarymodel.slim_optimize() to ensure feasibilityoptimal indicates successful solven and processes=1 on genome-scale modelsInfeasible solutions: Check medium constraints, reaction bounds, and model consistency
Slow optimization: Try different solvers (GLPK, CPLEX, Gurobi) via model.solver
Unbounded solutions: Verify exchange reactions have appropriate upper bounds
Import errors: Ensure correct file format and valid SBML identifiers
For detailed workflows and API patterns, refer to:
references/workflows.md - Comprehensive step-by-step workflow examplesreferences/api_quick_reference.md - Common function signatures and patternsOfficial documentation: https://cobrapy.readthedocs.io/en/latest/