COBRApy - Constraint-Based Reconstruction and Analysis

Overview

COBRApy is a Python library for constraint-based reconstruction and analysis (COBRA) of metabolic models, essential for systems biology research. Work with genome-scale metabolic models, perform computational simulations of cellular metabolism, conduct metabolic engineering analyses, and predict phenotypic behaviors.

Core Capabilities

COBRApy provides comprehensive tools organized into several key areas:

Model Management

Load existing models from repositories or files:

from cobra.io import load_model

Load bundled test models

model = load_model("textbook") # E. coli core model model = load_model("ecoli") # Full E. coli model model = load_model("salmonella")

Load from files

from cobra.io import read_sbml_model, load_json_model, load_yaml_model model = read_sbml_model("path/to/model.xml") model = load_json_model("path/to/model.json") model = load_yaml_model("path/to/model.yml")

Save models in various formats:

from cobra.io import write_sbml_model, save_json_model, save_yaml_model write_sbml_model(model, "output.xml") # Preferred format save_json_model(model, "output.json") # For Escher compatibility save_yaml_model(model, "output.yml") # Human-readable

Model Structure and Components

Access and inspect model components:

Access components

model.reactions # DictList of all reactions model.metabolites # DictList of all metabolites model.genes # DictList of all genes

Get specific items by ID or index

reaction = model.reactions.get_by_id("PFK") metabolite = model.metabolites[0]

Inspect properties

print(reaction.reaction) # Stoichiometric equation print(reaction.bounds) # Flux constraints print(reaction.gene_reaction_rule) # GPR logic print(metabolite.formula) # Chemical formula print(metabolite.compartment) # Cellular location

Flux Balance Analysis (FBA)

Perform standard FBA simulation:

Basic optimization

solution = model.optimize() print(f"Objective value: {solution.objective_value}") print(f"Status: {solution.status}")

Access fluxes

print(solution.fluxes["PFK"]) print(solution.fluxes.head())

Fast optimization (objective value only)

objective_value = model.slim_optimize()

Change objective

model.objective = "ATPM" solution = model.optimize()

Parsimonious FBA (minimize total flux):

from cobra.flux_analysis import pfba solution = pfba(model)

Geometric FBA (find central solution):

from cobra.flux_analysis import geometric_fba solution = geometric_fba(model)

Flux Variability Analysis (FVA)

Determine flux ranges for all reactions:

from cobra.flux_analysis import flux_variability_analysis

Standard FVA

fva_result = flux_variability_analysis(model)

FVA at 90% optimality

fva_result = flux_variability_analysis(model, fraction_of_optimum=0.9)

Loopless FVA (eliminates thermodynamically infeasible loops)

fva_result = flux_variability_analysis(model, loopless=True)

FVA for specific reactions

fva_result = flux_variability_analysis( model, reaction_list=["PFK", "FBA", "PGI"] )

Gene and Reaction Deletion Studies

Perform knockout analyses:

from cobra.flux_analysis import ( single_gene_deletion, single_reaction_deletion, double_gene_deletion, double_reaction_deletion )

Single deletions

gene_results = single_gene_deletion(model) reaction_results = single_reaction_deletion(model)

Double deletions (uses multiprocessing)

double_gene_results = double_gene_deletion( model, processes=4 # Number of CPU cores )

Manual knockout using context manager

with model: model.genes.get_by_id("b0008").knock_out() solution = model.optimize() print(f"Growth after knockout: {solution.objective_value}")

Model automatically reverts after context exit

Growth Media and Minimal Media

Manage growth medium:

View current medium

print(model.medium)

Modify medium (must reassign entire dict)

medium = model.medium medium["EX_glc__D_e"] = 10.0 # Set glucose uptake medium["EX_o2_e"] = 0.0 # Anaerobic conditions model.medium = medium

Calculate minimal media

from cobra.medium import minimal_medium

Minimize total import flux

min_medium = minimal_medium(model, minimize_components=False)

Minimize number of components (uses MILP, slower)

min_medium = minimal_medium( model, minimize_components=True, open_exchanges=True )

Flux Sampling

Sample the feasible flux space:

from cobra.sampling import sample

Sample using OptGP (default, supports parallel processing)

samples = sample(model, n=1000, method="optgp", processes=4)

Sample using ACHR

samples = sample(model, n=1000, method="achr")

Validate samples

from cobra.sampling import OptGPSampler sampler = OptGPSampler(model, processes=4) sampler.sample(1000) validation = sampler.validate(sampler.samples) print(validation.value_counts()) # Should be all 'v' for valid

Production Envelopes

Calculate phenotype phase planes:

from cobra.flux_analysis import production_envelope

Standard production envelope

envelope = production_envelope( model, reactions=["EX_glc__D_e", "EX_o2_e"], objective="EX_ac_e" # Acetate production )

With carbon yield

envelope = production_envelope( model, reactions=["EX_glc__D_e", "EX_o2_e"], carbon_sources="EX_glc__D_e" )

Visualize (use matplotlib or pandas plotting)

import matplotlib.pyplot as plt envelope.plot(x="EX_glc__D_e", y="EX_o2_e", kind="scatter") plt.show()

Gapfilling

Add reactions to make models feasible:

from cobra.flux_analysis import gapfill

Prepare universal model with candidate reactions

universal = load_model("universal")

Perform gapfilling

with model: # Remove reactions to create gaps for demonstration model.remove_reactions([model.reactions.PGI])

# Find reactions needed
solution = gapfill(model, universal)
print(f"Reactions to add: {solution}")

10. Model Building

Build models from scratch:

from cobra import Model, Reaction, Metabolite

Create model

model = Model("my_model")

Create metabolites

atp_c = Metabolite("atp_c", formula="C10H12N5O13P3", name="ATP", compartment="c") adp_c = Metabolite("adp_c", formula="C10H12N5O10P2", name="ADP", compartment="c") pi_c = Metabolite("pi_c", formula="HO4P", name="Phosphate", compartment="c")

Create reaction

reaction = Reaction("ATPASE") reaction.name = "ATP hydrolysis" reaction.subsystem = "Energy" reaction.lower_bound = 0.0 reaction.upper_bound = 1000.0

Add metabolites with stoichiometry

reaction.add_metabolites({ atp_c: -1.0, adp_c: 1.0, pi_c: 1.0 })

Add gene-reaction rule

reaction.gene_reaction_rule = "(gene1 and gene2) or gene3"

Add to model

model.add_reactions([reaction])

Add boundary reactions

model.add_boundary(atp_c, type="exchange") model.add_boundary(adp_c, type="demand")

Set objective

model.objective = "ATPASE"

Common Workflows

Workflow 1: Load Model and Predict Growth

from cobra.io import load_model

Load model

model = load_model("ecoli")

Run FBA

solution = model.optimize() print(f"Growth rate: {solution.objective_value:.3f} /h")

Show active pathways

print(solution.fluxes[solution.fluxes.abs() > 1e-6])

Workflow 2: Gene Knockout Screen

from cobra.io import load_model from cobra.flux_analysis import single_gene_deletion

Load model

model = load_model("ecoli")

Perform single gene deletions

results = single_gene_deletion(model)

Find essential genes (growth < threshold)

essential_genes = results[results["growth"] < 0.01] print(f"Found {len(essential_genes)} essential genes")

Find genes with minimal impact

neutral_genes = results[results["growth"] > 0.9 * solution.objective_value]

Workflow 3: Media Optimization

from cobra.io import load_model from cobra.medium import minimal_medium

Load model

model = load_model("ecoli")

Calculate minimal medium for 50% of max growth

target_growth = model.slim_optimize() * 0.5 min_medium = minimal_medium( model, target_growth, minimize_components=True )

print(f"Minimal medium components: {len(min_medium)}") print(min_medium)

Workflow 4: Flux Uncertainty Analysis

from cobra.io import load_model from cobra.flux_analysis import flux_variability_analysis from cobra.sampling import sample

Load model

model = load_model("ecoli")

First check flux ranges at optimality

fva = flux_variability_analysis(model, fraction_of_optimum=1.0)

For reactions with large ranges, sample to understand distribution

samples = sample(model, n=1000)

Analyze specific reaction

reaction_id = "PFK" import matplotlib.pyplot as plt samples[reaction_id].hist(bins=50) plt.xlabel(f"Flux through {reaction_id}") plt.ylabel("Frequency") plt.show()

Workflow 5: Context Manager for Temporary Changes

Use context managers to make temporary modifications:

Model remains unchanged outside context

with model: # Temporarily change objective model.objective = "ATPM"

# Temporarily modify bounds
model.reactions.EX_glc__D_e.lower_bound = -5.0

# Temporarily knock out genes
model.genes.b0008.knock_out()

# Optimize with changes
solution = model.optimize()
print(f"Modified growth: {solution.objective_value}")

All changes automatically reverted

solution = model.optimize() print(f"Original growth: {solution.objective_value}")

Key Concepts

DictList Objects

Models use DictList objects for reactions, metabolites, and genes - behaving like both lists and dictionaries:

Access by index

first_reaction = model.reactions[0]

Access by ID

pfk = model.reactions.get_by_id("PFK")

Query methods

atp_reactions = model.reactions.query("atp")

Flux Constraints

Reaction bounds define feasible flux ranges:

Irreversible: lower_bound = 0, upper_bound > 0
Reversible: lower_bound < 0, upper_bound > 0
Set both bounds simultaneously with .bounds to avoid inconsistencies

Gene-Reaction Rules (GPR)

Boolean logic linking genes to reactions:

AND logic (both required)

reaction.gene_reaction_rule = "gene1 and gene2"

OR logic (either sufficient)

reaction.gene_reaction_rule = "gene1 or gene2"

Complex logic

reaction.gene_reaction_rule = "(gene1 and gene2) or (gene3 and gene4)"

Exchange Reactions

Special reactions representing metabolite import/export:

Named with prefix EX_ by convention
Positive flux = secretion, negative flux = uptake
Managed through model.medium dictionary

Best Practices

Use context managers for temporary modifications to avoid state management issues
Validate models before analysis using model.slim_optimize() to ensure feasibility
Check solution status after optimization - optimal indicates successful solve
Use loopless FVA when thermodynamic feasibility matters
Set fraction_of_optimum appropriately in FVA to explore suboptimal space
Parallelize computationally expensive operations (sampling, double deletions)
Prefer SBML format for model exchange and long-term storage
Use slim_optimize() when only objective value needed for performance
Validate flux samples to ensure numerical stability

Troubleshooting

Infeasible solutions: Check medium constraints, reaction bounds, and model consistency Slow optimization: Try different solvers (GLPK, CPLEX, Gurobi) via model.solver

Unbounded solutions: Verify exchange reactions have appropriate upper bounds Import errors: Ensure correct file format and valid SBML identifiers

References

For detailed workflows and API patterns, refer to:

references/workflows.md
Comprehensive step-by-step workflow examples
references/api_quick_reference.md
Common function signatures and patterns

Official documentation: https://cobrapy.readthedocs.io/en/latest/

cobrapy

Safety Notice

Copy this and send it to your AI assistant to learn

Load bundled test models

Load from files

Access components

Get specific items by ID or index

Inspect properties

Basic optimization

Access fluxes

Fast optimization (objective value only)

Change objective

Standard FVA

FVA at 90% optimality

Loopless FVA (eliminates thermodynamically infeasible loops)

FVA for specific reactions

Single deletions

Double deletions (uses multiprocessing)

Manual knockout using context manager

Model automatically reverts after context exit

View current medium

Modify medium (must reassign entire dict)

Calculate minimal media

Minimize total import flux

Minimize number of components (uses MILP, slower)

Sample using OptGP (default, supports parallel processing)

Sample using ACHR

Validate samples

Standard production envelope

With carbon yield

Visualize (use matplotlib or pandas plotting)

Prepare universal model with candidate reactions

Perform gapfilling

Create model

Create metabolites

Create reaction

Add metabolites with stoichiometry

Add gene-reaction rule

Add to model

Add boundary reactions

Set objective

Load model

Run FBA

Show active pathways

Load model

Perform single gene deletions

Find essential genes (growth < threshold)

Find genes with minimal impact

Load model

Calculate minimal medium for 50% of max growth

Load model

First check flux ranges at optimality

For reactions with large ranges, sample to understand distribution

Analyze specific reaction

Model remains unchanged outside context

All changes automatically reverted

Access by index

Access by ID

Query methods

AND logic (both required)

OR logic (either sufficient)

Complex logic

Source Transparency

Related Skills

research-lookup

statistical-analysis

market-research-reports

research-grants