Use LigandMPNN Online

Commercially Available LigandMPNN No-Code Web Server

LigandMPNN: Atomic Context-Conditioned Protein Sequence Design

LigandMPNN is a deep learning method for protein sequence design that explicitly models non-protein atomic context. Developed by the Baker Lab at the University of Washington, it generalizes the ProteinMPNN architecture to incorporate small molecules, nucleotides, metals, and other non-protein components.

By reasoning about the full biomolecular system, LigandMPNN enables the precise design of enzymes, biosensors, and binders that require high chemical complementarity between a protein and its target.

Key Innovations: Explicit Non-Protein Modeling

LigandMPNN overcomes the critical limitation of previous models that only considered protein backbone coordinates.

  • Atomic Contextual Graphs: Operates on three distinct graphs: a protein-only graph, an intra-ligand graph (element types and inter-atomic distances), and a protein-ligand graph that transfers information from ligand atoms to protein residues.

  • Message Passing for Richer Data: Intra-ligand message passing increases the richness of information transferred to the protein, allowing the model to "see" specific chemical environments.

  • Sidechain Conformation Generation: Unlike models that only output sequences, LigandMPNN also predicts sidechain torsion angles (χ1–χ4), facilitating detailed evaluation of binding interactions.

  • Fast and Scalable Architecture: The neural network features 2.62 million parameters and remains highly efficient, designing sequences for a 100-residue protein in approximately 0.9 seconds on a single CPU.

  • Generalization to New Chemistry: Because protein sidechains and ligands are built from the same common elements (e.g., carbon, oxygen, nitrogen), the model can generalize from sidechain atoms to small-molecule contexts it was not specifically trained on.
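
The protein-ligand graph described above can be pictured as each residue being connected to its nearest ligand atoms. The following is a minimal sketch of that idea only, not the LigandMPNN implementation: the function name `nearest_ligand_atoms`, the use of one representative coordinate per residue, the value of `k`, and the toy coordinates are all assumptions made for illustration.

```python
import math

def nearest_ligand_atoms(residue_xyz, ligand_xyz, k=3):
    """For each residue, return [(ligand_atom_index, distance), ...] for its k closest ligand atoms."""
    neighbors = []
    for res in residue_xyz:
        # Distance from this residue's representative atom to every ligand atom
        dists = [(math.dist(res, atom), j) for j, atom in enumerate(ligand_xyz)]
        dists.sort()  # closest first
        neighbors.append([(j, d) for d, j in dists[:k]])
    return neighbors

# Toy coordinates in angstroms (hypothetical, laid out on the x axis)
residue_xyz = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand_xyz = [(1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (9.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

neighbors = nearest_ligand_atoms(residue_xyz, ligand_xyz, k=2)
for i, edges in enumerate(neighbors):
    print(i, edges)
```

In a real model each edge would also carry features such as the ligand atom's element type and the distance, which is what lets information flow from ligand atoms to protein residues.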

Performance Benchmarks: Superior Recovery Accuracy

LigandMPNN significantly outperforms both physics-based methods (Rosetta) and predecessor models (ProteinMPNN) in native sequence recovery for residues interacting with non-protein atoms.

| Context Type | Rosetta Recovery | ProteinMPNN Recovery | LigandMPNN Result | Key Finding |
|---|---|---|---|---|
| Small Molecules | 50.4% | 50.5% | 63.3% | Significant jump in binding site accuracy |
| Nucleotides | 35.2% | 34.0% | 50.5% | Overcomes large-scale atom modeling challenges |
| Metals | 36.0% | 40.6% | 77.5% | Largest gain of the three context types at metal coordination sites |
| Sidechain Packing | 76.0% χ1 | 83.3% χ1 | 86.1% χ1 | Higher recovery of native torsion angles |
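
Native sequence recovery, the metric reported above, is the fraction of positions where the designed residue matches the native one, restricted here to residues near non-protein atoms. A minimal sketch of that calculation follows; the function name, the toy sequences, and `interface_positions` (0-based indices of the residues counted as interacting with the ligand) are hypothetical and not taken from the benchmark.

```python
def sequence_recovery(native, designed, positions=None):
    """Fraction of scored positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    # Score every position unless a subset (e.g. ligand-contacting residues) is given
    positions = list(range(len(native))) if positions is None else list(positions)
    if not positions:
        raise ValueError("no positions to score")
    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)

# Toy example (hypothetical sequences)
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
interface_positions = [2, 3, 4, 5]

overall = sequence_recovery(native, designed)
interface = sequence_recovery(native, designed, interface_positions)
print(f"overall: {overall:.1%}, interface: {interface:.1%}")
```

Restricting the score to interface positions is what makes the metric sensitive to ligand context: a design can look good overall while recovering few of the residues that actually contact the ligand.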


Scientific Breakthroughs in Interaction Design

Rescuing Non-Functional Designs

Experimental characterization has shown that LigandMPNN can "rescue" weak or non-binding proteins designed by older methods. For example, LigandMPNN redesigns of rocuronium and cholic acid binders successfully introduced new sidechain-ligand hydrogen bonds that were missing in original Rosetta designs.

Enzyme and DNA Binder Engineering

LigandMPNN has been experimentally validated across more than 100 protein-DNA and protein-small molecule interfaces. It has successfully generated sequence-specific DNA-binding proteins that recognize targets in the major groove, with crystal structures (e.g., PDB ID 8TAC) matching the design models with high fidelity.

Multi-State and Symmetric Design

Because the model decodes residues autoregressively in a random order rather than strictly from N- to C-terminus, it can be applied to the design of complex architectures, including symmetric protein assemblies and multi-state hinge proteins.
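
One way to picture how random-order decoding accommodates symmetry is to decode tied positions together from a shared distribution. The sketch below is an illustration under stated assumptions, not the LigandMPNN code: `toy_logits` is a hypothetical stand-in for the network, and averaging logits across tied positions is the assumed tying rule.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def toy_logits(position, decoded):
    """Hypothetical stand-in for the network: scores depend on position and on how much is decoded."""
    return [math.sin(position + 1 + i) + 0.1 * len(decoded) for i in range(len(ALPHABET))]

def sample_from_logits(logits, rng, temperature=1.0):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [value / temperature for value in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]  # subtract max for numerical stability
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def decode_with_ties(length, tied_groups, rng, logits_fn=toy_logits):
    """Decode in a random order; positions in the same tied group receive the same residue."""
    tied = {p for group in tied_groups for p in group}
    groups = [list(g) for g in tied_groups] + [[p] for p in range(length) if p not in tied]
    rng.shuffle(groups)  # random decoding order over groups
    decoded = {}
    for group in groups:
        per_position = [logits_fn(p, decoded) for p in group]
        # Average logits over the tied positions so they share one distribution
        averaged = [sum(column) / len(group) for column in zip(*per_position)]
        residue = ALPHABET[sample_from_logits(averaged, rng)]
        for p in group:
            decoded[p] = residue
    return "".join(decoded[p] for p in range(length))

rng = random.Random(0)
# Positions 0/4 and 1/5 are tied, as they would be in a symmetric two-copy assembly
sequence = decode_with_ties(8, tied_groups=[(0, 4), (1, 5)], rng=rng)
print(sequence)
```

The same grouping idea extends to multi-state design, where the tied positions are the same residue seen in different backbone conformations.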

LigandMPNN on Tamarind Bio: Comprehensive Atomic Design

Tamarind Bio provides an optimized, no-code environment to deploy LigandMPNN for high-throughput design of enzymes, sensors, and therapeutic binders.

  • Integrated Non-Protein Handling: Seamlessly manage ligands, ions, and nucleic acids in your design loop without manual force field parameterization.

  • Interactive Confidence Reports: Leverage per-residue predicted confidence scores that correlate strongly with actual sequence recovery accuracy to identify the most promising leads.

How to Use LigandMPNN on Tamarind Bio

  1. Access the Toolkit: Log in to tamarind.bio and select the LigandMPNN tool.

  2. Upload Molecular Complex: Provide a PDB file containing both the protein backbone and the atomic context (small molecule, ion, or nucleotide).

  3. Define Designable Residues: Use the interactive 3D viewer to select which residues should be redesigned (typically those within 5.0 Å of the ligand).

  4. Set Modeling Constraints: Optionally input specific sidechain coordinates to stabilize functional sites of interest.

  5. Run Design Trajectories: The platform executes message-passing cycles to generate optimized sequences and their corresponding sidechain conformations.

  6. Evaluate & Validate: Review sequence logos and confidence maps, then download top candidates for experimental synthesis.
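
For readers preparing inputs outside the 3D viewer, the 5.0 Å selection rule from step 3 can be sketched in a few lines. This is an illustrative script, not part of the Tamarind Bio platform: the function names and toy coordinates are hypothetical, and it assumes standard fixed-column PDB records and treats every HETATM atom as ligand (real files would need waters and modified residues filtered out).

```python
import math

def parse_atoms(pdb_lines):
    """Yield (record, chain, residue_number, (x, y, z)) from fixed-column PDB lines."""
    for line in pdb_lines:
        record = line[0:6].strip()
        if record in ("ATOM", "HETATM"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield record, line[21], int(line[22:26]), xyz

def residues_near_ligand(pdb_lines, cutoff=5.0):
    """Return sorted (chain, residue_number) pairs with any atom within cutoff of a HETATM atom."""
    atoms = list(parse_atoms(pdb_lines))
    ligand = [xyz for record, _, _, xyz in atoms if record == "HETATM"]
    selected = set()
    for record, chain, resnum, xyz in atoms:
        if record == "ATOM" and any(math.dist(xyz, lig) <= cutoff for lig in ligand):
            selected.add((chain, resnum))
    return sorted(selected)

def pdb_line(record, serial, name, resname, chain, resnum, x, y, z):
    """Format one toy record so that the fields land in the standard PDB columns."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resnum:>4}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")

# Toy complex: three C-alpha atoms and one zinc ion (hypothetical coordinates)
toy = [
    pdb_line("ATOM", 1, "CA", "HIS", "A", 10, 0.0, 0.0, 0.0),
    pdb_line("ATOM", 2, "CA", "GLY", "A", 11, 3.8, 0.0, 0.0),
    pdb_line("ATOM", 3, "CA", "LEU", "A", 12, 12.0, 0.0, 0.0),
    pdb_line("HETATM", 4, "ZN", "ZN", "A", 101, 0.0, 2.0, 0.0),
]

designable = residues_near_ligand(toy)
print(designable)
```

Residues outside the cutoff would typically be kept fixed, so only the binding-site shell is redesigned.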
