Therapeutic Protein Designer
AI-guided de novo protein design using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation for therapeutic protein development.
KEY PRINCIPLES:
- Structure-first - Generate backbone geometry before sequence
- Target-guided - Design binders with target structure in mind
- Iterative validation - Predict structure to validate designs
- Developability-aware - Consider aggregation, immunogenicity, expression
- Evidence-graded - Grade designs by confidence metrics
- Actionable output - Provide sequences ready for experimental testing
- English-first queries - Always use English terms in tool calls
When to Use
Apply when user asks to:
- Design a protein binder, therapeutic protein, or scaffold
- Optimize a protein sequence for function
- Design a de novo enzyme
- Generate protein variants for target binding
Workflow Overview
Phase 1: Target Characterization
Get structure (PDB, EMDB cryo-EM, AlphaFold), identify binding epitope
Phase 2: Backbone Generation (RFdiffusion)
Define constraints, generate >= 5 backbones, filter by geometry
Phase 3: Sequence Design (ProteinMPNN)
Design >= 8 sequences per backbone, sample with temperature control
Phase 4: Structure Validation (ESMFold/AlphaFold2)
Predict structure, compare to backbone, assess pLDDT/pTM
Phase 5: Developability Assessment
Aggregation, pI, expression prediction
Phase 6: Report Synthesis
Ranked candidates, FASTA, experimental recommendations
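As a rough illustration of the kind of sequence-level numbers Phase 5 can start from, the sketch below computes a hydrophobic fraction and an approximate net charge. It is a crude heuristic, not an aggregation or pI predictor: the function name `crude_developability`, the choice of hydrophobic residues, and the charge rule (K/R counted as +1, D/E as -1, histidine and termini ignored) are all assumptions made for this example.

```python
# Residues treated as hydrophobic; this set is an illustrative assumption.
HYDROPHOBIC = set("AILMFVWY")


def crude_developability(sequence: str) -> dict:
    """Return simple sequence-level metrics for a first-pass developability look."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    positive = sum(seq.count(aa) for aa in "KR")  # Lys, Arg
    negative = sum(seq.count(aa) for aa in "DE")  # Asp, Glu
    hydrophobic = sum(1 for aa in seq if aa in HYDROPHOBIC)
    return {
        "length": len(seq),
        "hydrophobic_fraction": hydrophobic / len(seq),
        # Ignores His, termini, and pH; only a coarse proxy for charge.
        "approx_net_charge": positive - negative,
    }
```

These values are only a starting point for triage; dedicated aggregation, pI, and expression predictors should replace them when available.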
Critical Requirements
Report-First Approach (MANDATORY)
- Create [TARGET]_protein_design_report.md first with section headers
- Progressively update as designs are generated
- Output [TARGET]_designed_sequences.fasta and [TARGET]_top_candidates.csv
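The report-first step could be sketched as follows. The section titles are derived from the six workflow phases, and the function name `init_report` and the "Pending" placeholder are assumptions for illustration rather than a prescribed template.

```python
from pathlib import Path

# Section headers mirror the six workflow phases; the exact layout is an assumption.
REPORT_SECTIONS = [
    "Target Characterization",
    "Backbone Generation",
    "Sequence Design",
    "Structure Validation",
    "Developability Assessment",
    "Ranked Candidates and Recommendations",
]


def init_report(target: str, out_dir: str = ".") -> Path:
    """Create [TARGET]_protein_design_report.md containing only section headers."""
    path = Path(out_dir) / f"{target}_protein_design_report.md"
    lines = [f"# {target} Protein Design Report", ""]
    for section in REPORT_SECTIONS:
        # Each section starts as a placeholder and is filled in as results arrive.
        lines += [f"## {section}", "", "_Pending_", ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

Calling `init_report("PDL1")` before any design tool runs gives later phases a file to update progressively.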
Design Documentation (MANDATORY)
Every design MUST include: Sequence, Length, Target, Method, and Quality Metrics (pLDDT, pTM, MPNN score, binding prediction).
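One way to enforce the required fields is a small record type that reports what is still missing. The class name `DesignRecord`, the `design_id` field, and the FASTA header layout are hypothetical choices for this sketch; only the list of required fields comes from the requirement above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DesignRecord:
    design_id: str                      # hypothetical identifier for the design
    sequence: str
    target: str
    method: str
    plddt: Optional[float] = None
    ptm: Optional[float] = None
    mpnn_score: Optional[float] = None
    binding_prediction: Optional[str] = None

    @property
    def length(self) -> int:
        # Length is derived from the sequence so the two cannot disagree.
        return len(self.sequence)

    def missing_fields(self) -> List[str]:
        """Names of required fields that are still unset or empty."""
        required = ["sequence", "target", "method", "plddt", "ptm",
                    "mpnn_score", "binding_prediction"]
        return [name for name in required if getattr(self, name) in (None, "")]

    def to_fasta(self) -> str:
        """FASTA entry; the header format is an assumption of this sketch."""
        header = f">{self.design_id} target={self.target} length={self.length}"
        return f"{header}\n{self.sequence}\n"
```

A design would be added to the report only once `missing_fields()` returns an empty list.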
NVIDIA NIM Tools
| Tool | Purpose | Key Parameter |
|---|---|---|
| NvidiaNIM_rfdiffusion | Backbone generation | diffusion_steps (NOT num_steps) |
| NvidiaNIM_proteinmpnn | Sequence design | pdb_string (NOT pdb) |
| NvidiaNIM_esmfold | Fast validation | sequence (NOT seq) |
| NvidiaNIM_alphafold2 | High-accuracy validation | sequence, algorithm |
| NvidiaNIM_esm2_650m | Sequence embeddings | sequences, format |
Common Parameter Mistakes
| Tool | Wrong | Correct |
|---|---|---|
| NvidiaNIM_rfdiffusion | num_steps=50 | diffusion_steps=50 |
| NvidiaNIM_proteinmpnn | pdb=content | pdb_string=content |
| NvidiaNIM_esmfold | seq="MVLS..." | sequence="MVLS..." |
| NvidiaNIM_alphafold2 | seq="MVLS..." | sequence="MVLS..." |
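The mistakes in the table above can be guarded against with a small rename step applied before each tool call. This is a sketch only: `normalize_params` is a hypothetical helper, and the mapping contains nothing beyond the wrong/correct pairs listed in the table.

```python
# Wrong -> correct parameter names, copied from the table above.
PARAM_FIXES = {
    "NvidiaNIM_rfdiffusion": {"num_steps": "diffusion_steps"},
    "NvidiaNIM_proteinmpnn": {"pdb": "pdb_string"},
    "NvidiaNIM_esmfold": {"seq": "sequence"},
    "NvidiaNIM_alphafold2": {"seq": "sequence"},
}


def normalize_params(tool: str, params: dict) -> dict:
    """Return a copy of params with known wrong names replaced by correct ones."""
    fixes = PARAM_FIXES.get(tool, {})
    fixed = {}
    for key, value in params.items():
        name = fixes.get(key, key)
        if name in fixed:
            # Both the wrong and the correct spelling were supplied.
            raise ValueError(f"{tool}: parameter {name!r} given more than once")
        fixed[name] = value
    return fixed
```

For example, `normalize_params("NvidiaNIM_rfdiffusion", {"num_steps": 50})` yields `{"diffusion_steps": 50}`, while parameters for tools not in the mapping pass through unchanged.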
NVIDIA NIM Requirements
- API Key: NVIDIA_API_KEY environment variable required
- Rate limits: 40 RPM (1.5 second minimum between calls)
- AlphaFold2 may return 202 (polling required); RFdiffusion and ESMFold are synchronous
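The 40 RPM limit and the 202 polling behaviour can be handled with a minimum-interval limiter and a generic polling loop, sketched below. Both `MinIntervalLimiter` and `poll_until_ready` are hypothetical helpers, and `fetch` stands in for whatever callable actually checks the job; it is assumed to return a `(status, body)` tuple, which is not something the tool documentation here specifies.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls (1.5 s corresponds to 40 RPM)."""

    def __init__(self, min_interval: float = 1.5,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep
        self._last = None

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


def poll_until_ready(fetch, limiter: MinIntervalLimiter, max_attempts: int = 20):
    """Call fetch() until it returns a status other than 202."""
    for _ in range(max_attempts):
        limiter.wait()           # every poll counts against the rate limit
        status, body = fetch()
        if status != 202:
            return status, body
    raise TimeoutError(f"still pending after {max_attempts} attempts")
```

Sharing one limiter instance across all NIM calls keeps the combined request rate under the limit, including the polling requests.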
Supporting Tools
| Tool | Purpose | Key Parameters |
|---|---|---|
| PDB_search_by_uniprot | Find PDB structures | uniprot_id |
| PDB_get_structure | Download PDB file | pdb_id |
| alphafold_get_prediction | Get AlphaFold DB structure | accession |
| emdb_search | Search cryo-EM maps | query |
| emdb_get_entry | Get entry details | entry_id |
| UniProt_get_protein_sequence | Get target sequence | accession |
| InterPro_get_protein_domains | Get domains | accession |
Evidence Grading
| Tier | Criteria |
|---|---|
| T1 (best) | pLDDT >85, pTM >0.8, low aggregation, neutral pI |
| T2 | pLDDT >75, pTM >0.7, acceptable developability |
| T3 | pLDDT >70, pTM >0.65, developability concerns |
| T4 | Failed validation or major developability issues |
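The tier table can be expressed as a small function. The thresholds come straight from the table; everything else is an assumption of this sketch: pLDDT is taken to be on a 0-100 scale, and the developability assessment is reduced to one of four hypothetical labels ("good", "acceptable", "concerns", "major").

```python
def grade_design(plddt: float, ptm: float, developability: str = "acceptable") -> str:
    """Map confidence metrics and a developability label to an evidence tier.

    developability is one of "good" (low aggregation, neutral pI),
    "acceptable", "concerns", or "major"; these labels are assumptions.
    """
    if developability not in ("good", "acceptable", "concerns", "major"):
        raise ValueError(f"unknown developability label: {developability!r}")
    if developability == "major":
        return "T4"  # major developability issues override good metrics
    if plddt > 85 and ptm > 0.8 and developability == "good":
        return "T1"
    if plddt > 75 and ptm > 0.7 and developability in ("good", "acceptable"):
        return "T2"
    if plddt > 70 and ptm > 0.65:
        return "T3"
    return "T4"      # failed validation
```

Because the thresholds are strict inequalities, a design with pLDDT exactly 85 and pTM exactly 0.8 falls into T2 rather than T1.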
Completeness Checklist
- Target structure obtained (PDB or predicted)
- Binding epitope identified
- >= 5 backbones generated, top 3-5 selected
- >= 8 sequences per backbone, MPNN scores reported
- All sequences validated (ESMFold), pLDDT/pTM reported, >= 3 passing
- Developability assessed (aggregation, pI, expression)
- Ranked candidate list, FASTA file, experimental recommendations
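The numeric items in the checklist lend themselves to an automated check before the report is finalized. The sketch below covers only those three thresholds; the function name `checklist_gaps` and its arguments are hypothetical, and the remaining checklist items still need manual confirmation.

```python
from typing import List

# Minimums taken from the checklist above.
MIN_BACKBONES = 5
MIN_SEQS_PER_BACKBONE = 8
MIN_PASSING = 3


def checklist_gaps(n_backbones: int, seqs_per_backbone: List[int],
                   n_passing: int) -> List[str]:
    """Return a description of each numeric checklist item that is not yet met."""
    gaps = []
    if n_backbones < MIN_BACKBONES:
        gaps.append(f"only {n_backbones} backbones generated (need >= {MIN_BACKBONES})")
    # seqs_per_backbone holds one count per backbone taken into sequence design.
    if not seqs_per_backbone or min(seqs_per_backbone) < MIN_SEQS_PER_BACKBONE:
        gaps.append(f"some backbones have fewer than {MIN_SEQS_PER_BACKBONE} sequences")
    if n_passing < MIN_PASSING:
        gaps.append(f"only {n_passing} designs pass validation (need >= {MIN_PASSING})")
    return gaps
```

An empty return value means the numeric minimums are satisfied; any returned strings can be copied into the report as open items.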
Reference Files
- DESIGN_PROCEDURES.md - Phase-by-phase code examples, sampling parameters, fallback chains
- TOOLS_REFERENCE.md - Complete tool documentation with code examples
- EXAMPLES.md - Sample design workflows and outputs
- CHECKLIST.md - Detailed phase checklists and quality metrics
- design_templates.md - Report templates and output format examples