COBRApy - Constraint-Based Reconstruction and Analysis
Overview
COBRApy is a Python library for constraint-based reconstruction and analysis (COBRA) of metabolic models, essential for systems biology research. Work with genome-scale metabolic models, perform computational simulations of cellular metabolism, conduct metabolic engineering analyses, and predict phenotypic behaviors.
Core Capabilities
COBRApy provides comprehensive tools organized into several key areas:
- Model Management
Load existing models from repositories or files:
from cobra.io import load_model
Load bundled test models
model = load_model("textbook") # E. coli core model model = load_model("ecoli") # Full E. coli model model = load_model("salmonella")
Load from files
from cobra.io import read_sbml_model, load_json_model, load_yaml_model model = read_sbml_model("path/to/model.xml") model = load_json_model("path/to/model.json") model = load_yaml_model("path/to/model.yml")
Save models in various formats:
from cobra.io import write_sbml_model, save_json_model, save_yaml_model write_sbml_model(model, "output.xml") # Preferred format save_json_model(model, "output.json") # For Escher compatibility save_yaml_model(model, "output.yml") # Human-readable
- Model Structure and Components
Access and inspect model components:
Access components
model.reactions # DictList of all reactions model.metabolites # DictList of all metabolites model.genes # DictList of all genes
Get specific items by ID or index
reaction = model.reactions.get_by_id("PFK") metabolite = model.metabolites[0]
Inspect properties
print(reaction.reaction) # Stoichiometric equation print(reaction.bounds) # Flux constraints print(reaction.gene_reaction_rule) # GPR logic print(metabolite.formula) # Chemical formula print(metabolite.compartment) # Cellular location
- Flux Balance Analysis (FBA)
Perform standard FBA simulation:
Basic optimization
solution = model.optimize() print(f"Objective value: {solution.objective_value}") print(f"Status: {solution.status}")
Access fluxes
print(solution.fluxes["PFK"]) print(solution.fluxes.head())
Fast optimization (objective value only)
objective_value = model.slim_optimize()
Change objective
model.objective = "ATPM" solution = model.optimize()
Parsimonious FBA (minimize total flux):
from cobra.flux_analysis import pfba solution = pfba(model)
Geometric FBA (find central solution):
from cobra.flux_analysis import geometric_fba solution = geometric_fba(model)
- Flux Variability Analysis (FVA)
Determine flux ranges for all reactions:
from cobra.flux_analysis import flux_variability_analysis
Standard FVA
fva_result = flux_variability_analysis(model)
FVA at 90% optimality
fva_result = flux_variability_analysis(model, fraction_of_optimum=0.9)
Loopless FVA (eliminates thermodynamically infeasible loops)
fva_result = flux_variability_analysis(model, loopless=True)
FVA for specific reactions
fva_result = flux_variability_analysis( model, reaction_list=["PFK", "FBA", "PGI"] )
- Gene and Reaction Deletion Studies
Perform knockout analyses:
from cobra.flux_analysis import ( single_gene_deletion, single_reaction_deletion, double_gene_deletion, double_reaction_deletion )
Single deletions
gene_results = single_gene_deletion(model) reaction_results = single_reaction_deletion(model)
Double deletions (uses multiprocessing)
double_gene_results = double_gene_deletion( model, processes=4 # Number of CPU cores )
Manual knockout using context manager
with model: model.genes.get_by_id("b0008").knock_out() solution = model.optimize() print(f"Growth after knockout: {solution.objective_value}")
Model automatically reverts after context exit
- Growth Media and Minimal Media
Manage growth medium:
View current medium
print(model.medium)
Modify medium (must reassign entire dict)
medium = model.medium medium["EX_glc__D_e"] = 10.0 # Set glucose uptake medium["EX_o2_e"] = 0.0 # Anaerobic conditions model.medium = medium
Calculate minimal media
from cobra.medium import minimal_medium
Minimize total import flux
min_medium = minimal_medium(model, minimize_components=False)
Minimize number of components (uses MILP, slower)
min_medium = minimal_medium( model, minimize_components=True, open_exchanges=True )
- Flux Sampling
Sample the feasible flux space:
from cobra.sampling import sample
Sample using OptGP (default, supports parallel processing)
samples = sample(model, n=1000, method="optgp", processes=4)
Sample using ACHR
samples = sample(model, n=1000, method="achr")
Validate samples
from cobra.sampling import OptGPSampler sampler = OptGPSampler(model, processes=4) sampler.sample(1000) validation = sampler.validate(sampler.samples) print(validation.value_counts()) # Should be all 'v' for valid
- Production Envelopes
Calculate phenotype phase planes:
from cobra.flux_analysis import production_envelope
Standard production envelope
envelope = production_envelope( model, reactions=["EX_glc__D_e", "EX_o2_e"], objective="EX_ac_e" # Acetate production )
With carbon yield
envelope = production_envelope( model, reactions=["EX_glc__D_e", "EX_o2_e"], carbon_sources="EX_glc__D_e" )
Visualize (use matplotlib or pandas plotting)
import matplotlib.pyplot as plt envelope.plot(x="EX_glc__D_e", y="EX_o2_e", kind="scatter") plt.show()
- Gapfilling
Add reactions to make models feasible:
from cobra.flux_analysis import gapfill
Prepare universal model with candidate reactions
universal = load_model("universal")
Perform gapfilling
with model: # Remove reactions to create gaps for demonstration model.remove_reactions([model.reactions.PGI])
# Find reactions needed
solution = gapfill(model, universal)
print(f"Reactions to add: {solution}")
10. Model Building
Build models from scratch:
from cobra import Model, Reaction, Metabolite
Create model
model = Model("my_model")
Create metabolites
atp_c = Metabolite("atp_c", formula="C10H12N5O13P3", name="ATP", compartment="c") adp_c = Metabolite("adp_c", formula="C10H12N5O10P2", name="ADP", compartment="c") pi_c = Metabolite("pi_c", formula="HO4P", name="Phosphate", compartment="c")
Create reaction
reaction = Reaction("ATPASE") reaction.name = "ATP hydrolysis" reaction.subsystem = "Energy" reaction.lower_bound = 0.0 reaction.upper_bound = 1000.0
Add metabolites with stoichiometry
reaction.add_metabolites({ atp_c: -1.0, adp_c: 1.0, pi_c: 1.0 })
Add gene-reaction rule
reaction.gene_reaction_rule = "(gene1 and gene2) or gene3"
Add to model
model.add_reactions([reaction])
Add boundary reactions
model.add_boundary(atp_c, type="exchange") model.add_boundary(adp_c, type="demand")
Set objective
model.objective = "ATPASE"
Common Workflows
Workflow 1: Load Model and Predict Growth
from cobra.io import load_model
Load model
model = load_model("ecoli")
Run FBA
solution = model.optimize() print(f"Growth rate: {solution.objective_value:.3f} /h")
Show active pathways
print(solution.fluxes[solution.fluxes.abs() > 1e-6])
Workflow 2: Gene Knockout Screen
from cobra.io import load_model from cobra.flux_analysis import single_gene_deletion
Load model
model = load_model("ecoli")
Perform single gene deletions
results = single_gene_deletion(model)
Find essential genes (growth < threshold)
essential_genes = results[results["growth"] < 0.01] print(f"Found {len(essential_genes)} essential genes")
Find genes with minimal impact
neutral_genes = results[results["growth"] > 0.9 * solution.objective_value]
Workflow 3: Media Optimization
from cobra.io import load_model from cobra.medium import minimal_medium
Load model
model = load_model("ecoli")
Calculate minimal medium for 50% of max growth
target_growth = model.slim_optimize() * 0.5 min_medium = minimal_medium( model, target_growth, minimize_components=True )
print(f"Minimal medium components: {len(min_medium)}") print(min_medium)
Workflow 4: Flux Uncertainty Analysis
from cobra.io import load_model from cobra.flux_analysis import flux_variability_analysis from cobra.sampling import sample
Load model
model = load_model("ecoli")
First check flux ranges at optimality
fva = flux_variability_analysis(model, fraction_of_optimum=1.0)
For reactions with large ranges, sample to understand distribution
samples = sample(model, n=1000)
Analyze specific reaction
reaction_id = "PFK" import matplotlib.pyplot as plt samples[reaction_id].hist(bins=50) plt.xlabel(f"Flux through {reaction_id}") plt.ylabel("Frequency") plt.show()
Workflow 5: Context Manager for Temporary Changes
Use context managers to make temporary modifications:
Model remains unchanged outside context
with model: # Temporarily change objective model.objective = "ATPM"
# Temporarily modify bounds
model.reactions.EX_glc__D_e.lower_bound = -5.0
# Temporarily knock out genes
model.genes.b0008.knock_out()
# Optimize with changes
solution = model.optimize()
print(f"Modified growth: {solution.objective_value}")
All changes automatically reverted
solution = model.optimize() print(f"Original growth: {solution.objective_value}")
Key Concepts
DictList Objects
Models use DictList objects for reactions, metabolites, and genes - behaving like both lists and dictionaries:
Access by index
first_reaction = model.reactions[0]
Access by ID
pfk = model.reactions.get_by_id("PFK")
Query methods
atp_reactions = model.reactions.query("atp")
Flux Constraints
Reaction bounds define feasible flux ranges:
-
Irreversible: lower_bound = 0, upper_bound > 0
-
Reversible: lower_bound < 0, upper_bound > 0
-
Set both bounds simultaneously with .bounds to avoid inconsistencies
Gene-Reaction Rules (GPR)
Boolean logic linking genes to reactions:
AND logic (both required)
reaction.gene_reaction_rule = "gene1 and gene2"
OR logic (either sufficient)
reaction.gene_reaction_rule = "gene1 or gene2"
Complex logic
reaction.gene_reaction_rule = "(gene1 and gene2) or (gene3 and gene4)"
Exchange Reactions
Special reactions representing metabolite import/export:
-
Named with prefix EX_ by convention
-
Positive flux = secretion, negative flux = uptake
-
Managed through model.medium dictionary
Best Practices
-
Use context managers for temporary modifications to avoid state management issues
-
Validate models before analysis using model.slim_optimize() to ensure feasibility
-
Check solution status after optimization - optimal indicates successful solve
-
Use loopless FVA when thermodynamic feasibility matters
-
Set fraction_of_optimum appropriately in FVA to explore suboptimal space
-
Parallelize computationally expensive operations (sampling, double deletions)
-
Prefer SBML format for model exchange and long-term storage
-
Use slim_optimize() when only objective value needed for performance
-
Validate flux samples to ensure numerical stability
Troubleshooting
Infeasible solutions: Check medium constraints, reaction bounds, and model consistency Slow optimization: Try different solvers (GLPK, CPLEX, Gurobi) via model.solver
Unbounded solutions: Verify exchange reactions have appropriate upper bounds Import errors: Ensure correct file format and valid SBML identifiers
References
For detailed workflows and API patterns, refer to:
-
references/workflows.md
-
Comprehensive step-by-step workflow examples
-
references/api_quick_reference.md
-
Common function signatures and patterns
Official documentation: https://cobrapy.readthedocs.io/en/latest/