How to Use PepMLM Online


Commercially Available Online Web Server

PepMLM: A New Tool for Therapeutic Peptide Design

PepMLM is a novel algorithm for designing linear peptide binders to a target protein using only its amino acid sequence. This approach is a significant step forward in protein engineering because it does not require a stable 3D structure, a requirement that has limited previous methods such as RFdiffusion to structured targets. PepMLM therefore enables the design of binders for "undruggable" targets, such as transcription factors and fusion oncoproteins, which are often conformationally disordered.

How PepMLM Works

PepMLM is built on the ESM-2 masked language model (MLM), which is fine-tuned to understand the relationship between a target protein's sequence and its binding peptides.

  • Span Masking: The model is trained using a unique span masking strategy where the entire peptide binder sequence is positioned at the C-terminus of the target protein sequence and is fully masked. This forces the model to learn to reconstruct the full binding region, rather than just individual amino acids.

  • Performance: PepMLM achieves low perplexity scores, which indicate high confidence in its predictions, and its generated binders closely mimic the perplexity distribution of real binders.

  • In Silico Benchmarking: In an in silico benchmark scored with AlphaFold-Multimer, PepMLM demonstrated a higher hit rate than RFdiffusion when designing peptides for structured targets.
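The span masking layout and the perplexity score described above can be illustrated with a minimal sketch. It does not load ESM-2 or PepMLM; the function names `build_masked_input` and `perplexity` are hypothetical, and the `<mask>` string is used here only as a placeholder for the model's mask token. The probabilities passed to `perplexity` are assumed to be the values a model assigned to each binder residue.

```python
import math

# Placeholder for the model's mask token (an assumption for illustration).
MASK_TOKEN = "<mask>"


def build_masked_input(target_seq: str, binder_length: int) -> list[str]:
    """Place a fully masked binder span at the C-terminus of the target."""
    if binder_length <= 0:
        raise ValueError("binder_length must be positive")
    return list(target_seq) + [MASK_TOKEN] * binder_length


def perplexity(residue_probs: list[float]) -> float:
    """Exponential of the mean negative log-probability over binder residues.

    Lower values mean the model was more confident in the binder.
    """
    if not residue_probs:
        raise ValueError("residue_probs must not be empty")
    mean_nll = -sum(math.log(p) for p in residue_probs) / len(residue_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Hypothetical 5-residue target with a 3-residue masked binder span.
    print(build_masked_input("MKTAY", 3))
    # Toy probabilities, not real model output.
    print(perplexity([0.5, 0.25, 0.5]))
```

Because every binder position is masked at once, the model has to reconstruct the whole binding span from the target sequence alone, which is the behavior the perplexity score then summarizes.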

What is Tamarind Bio?

Tamarind Bio is a pioneering no-code bioinformatics platform built to democratize access to powerful computational tools for life scientists and researchers. Recognizing that many cutting-edge machine learning models are often difficult to deploy and use, Tamarind provides an intuitive, web-based environment that completely abstracts away the complexities of high-performance computing, software dependencies, and command-line interfaces.

The platform is designed to give easy access to biologists, chemists, and other researchers who may not have a background in programming or cloud infrastructure but want to run experimental models on their own data. Key features include a user-friendly graphical interface for setting up and launching experiments, a robust API for integration into existing research pipelines, and an automated system for managing and scaling computational resources. By handling the technical heavy lifting, Tamarind lets researchers concentrate on their scientific questions and accelerate the pace of discovery.

Accelerating Discovery with PepMLM on Tamarind Bio

Using PepMLM on a platform like Tamarind could accelerate the discovery of therapeutics for a variety of challenging targets:

  • Targeting "Undruggable" Proteins: By designing peptides from sequence alone, PepMLM can be used to generate binders for disordered proteins that lack a stable 3D structure, opening up new therapeutic possibilities for diseases like Huntington's disease.

  • In Vitro Validation: The peptides designed by PepMLM have been experimentally validated through fusion to E3 ubiquitin ligase domains, demonstrating the ability to degrade disease-related proteins, including mutant huntingtin and viral phosphoproteins from viruses like Nipah and Hendra.

  • Streamlined Workflow: On Tamarind, researchers can easily use PepMLM to generate libraries of potential peptide binders, screen them computationally using tools like AlphaFold-Multimer, and then prioritize the most promising candidates for experimental testing.
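The prioritization step in the workflow above could look something like the following sketch. It assumes each candidate already has a perplexity from PepMLM and ipTM and pLDDT values from an AlphaFold-Multimer run. The `Candidate` class, the `prioritize` function, and the cutoff values are all hypothetical choices for illustration, not recommended thresholds or part of any Tamarind API.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sequence: str
    perplexity: float  # lower means the language model was more confident
    iptm: float        # predicted interface score, 0 to 1
    plddt: float       # predicted local confidence, 0 to 100


def prioritize(candidates, min_iptm=0.7, min_plddt=70.0, top_n=10):
    """Keep candidates above both structural cutoffs, then rank them.

    Ranking is by ipTM (highest first), with perplexity (lowest first)
    as the tie-breaker. The cutoffs are illustrative assumptions.
    """
    passing = [
        c for c in candidates
        if c.iptm >= min_iptm and c.plddt >= min_plddt
    ]
    ranked = sorted(passing, key=lambda c: (-c.iptm, c.perplexity))
    return ranked[:top_n]


if __name__ == "__main__":
    # Made-up placeholder values, not experimental or predicted data.
    library = [
        Candidate("AAAA", perplexity=5.0, iptm=0.82, plddt=85.0),
        Candidate("CCCC", perplexity=3.0, iptm=0.82, plddt=90.0),
        Candidate("DDDD", perplexity=2.0, iptm=0.40, plddt=95.0),
    ]
    for c in prioritize(library):
        print(c.sequence, c.iptm, c.perplexity)
```

Filtering before ranking keeps a peptide with a low perplexity but a poor predicted interface from reaching the shortlist for experimental testing.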

How to Use PepMLM on Tamarind Bio

To leverage PepMLM's power on a platform like Tamarind, a researcher could follow this streamlined workflow:

  1. Access the Platform: Begin by logging in to the tamarind.bio website.

  2. Select PepMLM: From the list of available computational models, choose the PepMLM tool.

  3. Input Target Sequence: Provide the amino acid sequence of the target protein you wish to bind.

  4. Define Binder Length: Specify the desired length of the peptide binder you want to generate.

  5. Generate Peptides: Run the PepMLM model to generate peptide sequences. You can generate a single peptide using greedy decoding, or multiple peptides using top-k sampling for increased diversity.

  6. Evaluate and Select: The generated peptides can be automatically evaluated for perplexity to indicate model confidence, and further assessed using structural metrics like ipTM and pLDDT through tools like AlphaFold-Multimer. You can then select the most promising candidates for experimental validation.
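The difference between greedy decoding and top-k sampling in step 5 can be shown with a small, self-contained sketch. The per-position probability tables below are toy stand-ins for what a model would predict at each masked binder position, and `greedy_decode` and `top_k_sample` are hypothetical helper names, not functions from PepMLM or Tamarind.

```python
import random


def greedy_decode(position_probs):
    """Pick the single most probable residue at every position."""
    return "".join(max(dist, key=dist.get) for dist in position_probs)


def top_k_sample(position_probs, k, rng):
    """Sample each residue from the k most probable options at that position."""
    residues = []
    for dist in position_probs:
        top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
        names = [residue for residue, _ in top]
        weights = [prob for _, prob in top]
        # random.choices renormalizes the weights over the k kept residues.
        residues.append(rng.choices(names, weights=weights, k=1)[0])
    return "".join(residues)


if __name__ == "__main__":
    # Toy distributions for a 2-residue binder (illustrative values only).
    toy_probs = [
        {"A": 0.6, "G": 0.3, "L": 0.1},
        {"K": 0.5, "R": 0.4, "E": 0.1},
    ]
    rng = random.Random(0)  # seeded so repeated runs give the same library
    print("greedy:", greedy_decode(toy_probs))
    print("top-k :", [top_k_sample(toy_probs, k=2, rng=rng) for _ in range(5)])
```

Greedy decoding always returns the same peptide for a given target, while top-k sampling trades some model confidence for a more diverse library; with `k=1` the two coincide.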
