Use PALM Online

Commercially Available PALM No-Code Web Server

PALM: Accurate Peptide Aggregation & APR Prediction

Predict peptide aggregation and map aggregation-prone regions at single-residue resolution using pretrained protein language model embeddings.

Dual-Level Prediction: Accurately classifies sequence-level peptide amyloidogenesis while mapping precise residue-level aggregation-prone regions (APRs).
State-of-the-Art Transfer Learning: Inherits rich structural and sequence contexts by leveraging pretrained ESM2 language model embeddings.
Tailored Light Attention: Extracts intricate local sequence motifs via a custom 1D convolutional Aggregation Predictor Module (APM).

How PALM Works

PALM (Predicting Aggregation with Language Model embeddings) provides a reliable deep-learning framework to overcome the bottleneck of resource-intensive wet-lab peptide aggregation assays.

Pretrained Protein Language Model Embeddings

PALM uses the pretrained ESM2 protein language model to turn raw amino acid sequences into per-residue embedding tensors. Larger language models often lose performance on specialized downstream tasks that diverge from masked language modeling, so PALM is built on the 8M-parameter ESM2 backbone. This keeps the model small while delivering statistically better residue-level predictive performance than the larger 35M, 150M, and 650M parameter counterparts.

The Aggregation Predictor Module (APM)

Once the sequence is featurized into residue embeddings, it enters the Aggregation Predictor Module:

Local Pattern Extraction: Two independent, one-dimensional convolutions (kernel size = 5, stride = 1) process the sequence dimension to generate attention weights and feature values.
Residue Aggregation Scoring: A multi-layer perceptron (MLP) head operates on each position to generate an independent residue aggregation score ($r_i$) scaled between 0 and 1.
Interpretability via Weighted Mean: Rather than a simple arithmetic mean, the individual residue scores are combined into an overall sequence score ($s$) using an attention-weighted mean. Because $s$ is a weighted sum of the $r_i$, each residue's contribution to the sequence-level prediction can be read off directly, which gives single-residue resolution without needing residue-level training annotations.
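The three steps above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not PALM's implementation: each convolution is assumed to emit one scalar per residue, the attention weights are assumed to be softmax-normalized over positions, and a plain sigmoid stands in for the MLP head. All function names, kernel values, and the toy embeddings are hypothetical.

```python
import math

def conv1d_same(x, kernel, bias=0.0):
    """Stride-1 convolution over the sequence axis with zero padding.

    x:      list of L embedding vectors, each of dimension D
    kernel: list of K weight vectors (K odd), each of dimension D
    Returns one scalar per residue (a single output channel).
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        total = bias
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):  # positions outside the sequence count as zeros
                total += sum(wd * xd for wd, xd in zip(w, x[j]))
        out.append(total)
    return out

def softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    z = sum(exps)
    return [e / z for e in exps]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def apm_forward(embeddings, attn_kernel, value_kernel):
    # Convolution 1: attention logits, normalized so the weights sum to 1.
    attn = softmax(conv1d_same(embeddings, attn_kernel))
    # Convolution 2: feature values; a sigmoid stands in for the MLP head
    # (assumption) and yields residue scores r_i in (0, 1).
    residue_scores = [sigmoid(v) for v in conv1d_same(embeddings, value_kernel)]
    # Sequence score s = sum_i attn_i * r_i (attention-weighted mean).
    seq_score = sum(a * r for a, r in zip(attn, residue_scores))
    return residue_scores, attn, seq_score

# Toy input: 8 residues with 2-dimensional embeddings, kernel size 5.
embeddings = [[0.1 * i, 1.0 - 0.1 * i] for i in range(8)]
attn_kernel = [[0.2, -0.1]] * 5
value_kernel = [[0.5, -0.5]] * 5
residue_scores, attn, seq_score = apm_forward(embeddings, attn_kernel, value_kernel)
```

Because the attention weights are non-negative and sum to 1, the sequence score always lies between the smallest and largest residue score, and the product of each weight and residue score is that residue's share of the sequence-level prediction.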

Advanced Sequence-Length Data Augmentation

Standard amyloid datasets, like WaltzDB-2.0, are largely restricted to hexapeptides. To bridge this gap in sequence space and handle natural or therapeutic peptides of highly variable lengths, PALM uses a data augmentation scheme. Prior to training, hexapeptides are padded at both the N- and C-termini with residues drawn from a non-hydrophobic residue distribution. This moves the training sequences out of an isolated length cluster and brings their embedding-space distribution closer to that of real multi-length biological data, improving accuracy on longer natural sequences.
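A minimal sketch of this kind of terminal padding is shown below. The residue set, the uniform sampling, the flank-length range, and the function name are all assumptions made for illustration; the actual distribution and lengths used to train PALM are not specified here. KLVFFA is used only as an example hexapeptide.

```python
import random

# Assumption: residues treated as "non-hydrophobic" for padding. Both this set
# and the uniform sampling are illustrative, not PALM's actual distribution.
NON_HYDROPHOBIC = "DEKRNQSTGHP"

def pad_hexapeptide(core, max_flank=10, rng=None):
    """Pad a peptide at both termini with random non-hydrophobic residues.

    Returns the padded sequence and the 0-based offset of the original
    peptide inside it, so sequence-level labels can still be traced back.
    """
    rng = rng or random.Random()
    n_len = rng.randint(1, max_flank)  # N-terminal flank length (hypothetical range)
    c_len = rng.randint(1, max_flank)  # C-terminal flank length (hypothetical range)
    n_flank = "".join(rng.choice(NON_HYDROPHOBIC) for _ in range(n_len))
    c_flank = "".join(rng.choice(NON_HYDROPHOBIC) for _ in range(c_len))
    return n_flank + core + c_flank, n_len

rng = random.Random(0)  # fixed seed so the example is reproducible
core = "KLVFFA"
padded, offset = pad_hexapeptide(core, rng=rng)
```

Drawing a fresh pair of flanks on each call produces many training sequences of different lengths from one labeled hexapeptide, which is the effect the augmentation is after.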

Unmatched Benchmarking Performance

PALM sets a high standard for unsupervised discovery of aggregation-prone regions and general peptide classification, outperforming traditional statistical and machine-learning models:

Serrano158 Benchmark (Sequence-level ROC AUC): 0.908 ± 0.005
- PALM significantly outperforms AggreScan (0.801) and ANuPP (0.857), establishing top-tier sequence classification capabilities.
AmyPro22 Benchmark (Residue-level ROC AUC): 0.678 ± 0.02
- PALM accurately targets true structural amyloid-prone hot spots, outperforming complex servers like AggreProt (0.639) and TANGO (0.645).
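For readers less familiar with the metric, ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the labels and scores are toy values, not benchmark data, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example: 1 = aggregating, 0 = non-aggregating.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

The same function applies at both levels: pass one score per peptide for a sequence-level benchmark, or one score per residue (pooled across proteins) for a residue-level one.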

Scaled Architectures for Demanding Mutational Screens

Standard transfer learning models struggle to register the subtle effects of single-point mutations, such as the 13 familial Alzheimer's disease (fAD) substitutions in the amyloid-beta (Aβ42) peptide. The PALM architecture can be retrained for this setting: when trained on massively parallel high-throughput selection datasets (like NNK1-3), PALM variants localize mutational shifts to within 1–3 residues, so that variants show up as clearly identifiable aggregation peaks.
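When preparing a mutational screen, variant sequences are usually generated from the wild type using standard substitution notation. The helper below is a hypothetical sketch of that preparation step, not part of PALM; it uses the Aβ42 sequence and the E22G substitution purely as an example.

```python
import re

# Wild-type human amyloid-beta 42 sequence (42 residues).
ABETA42 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"

def apply_substitution(sequence, mutation):
    """Apply a single substitution such as 'E22G' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        # Guards against off-by-one numbering or a wrong reference sequence.
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]

variant = apply_substitution(ABETA42, "E22G")
```

Checking the wild-type residue before substituting catches numbering mistakes early, which matters when the residue-level profiles of wild type and variant are later compared position by position.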

What is Tamarind Bio?

Tamarind Bio is a pioneering no-code bioinformatics platform built to democratize access to powerful computational tools for life scientists and researchers. Recognizing that many cutting-edge machine learning models are often difficult to deploy and use, Tamarind provides an intuitive, web-based environment that completely abstracts away the complexities of high-performance computing, software dependencies, and command-line interfaces.

The platform is designed to provide easy access for biologists, chemists, and other researchers who may not have a background in programming or cloud infrastructure but want to run experimental models on their own data. By handling the technical heavy lifting, Tamarind lets researchers concentrate on their scientific questions and accelerate the pace of discovery. The Tamarind team treats information and data security as a top priority, so that your sequence data is protected.

How to Use PALM on Tamarind Bio

Leveraging PALM’s advanced pLM architecture to predict amyloid-prone hot spots or screen candidate therapeutics is streamlined on Tamarind Bio:

Access the Platform: Log in to your secure workspace on the Tamarind Bio website.
Select PALM: Navigate through the computational biology suite and select the PALM (Predicting Aggregation with Language Model embeddings) tool.
Input Your Sequence Data: Upload or paste the raw amino acid sequence of the peptide or candidate therapeutic target you wish to screen.
Choose Your Execution Mode:
- Select Base PALM (optimized on augmented WaltzDB data) for robust general classification and baseline APR localization.
- Select PALM (NNK1-3) for high-throughput mutational screening tasks and to map sensitive individual variant substitutions.
Run the Prediction Engine: Submit your job. Tamarind Bio scales cloud infrastructure automatically, mapping the sequence through the ESM2 embedding layers and the APM attention kernels smoothly.
Analyze and Export Interactive Outputs: Instantly view individual sequence classification probabilities alongside continuous single-residue visualization charts to precisely pinpoint biological APR boundaries. Download publication-ready tables and prediction profiles to guide your downstream rational peptide design or in vitro verification assays.
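Once a residue-level profile has been downloaded, APR boundaries can be extracted programmatically. The sketch below assumes the profile is available as a plain list of per-residue scores between 0 and 1; the export format, the 0.5 threshold, the minimum region length, and the function name are all illustrative assumptions rather than Tamarind Bio or PALM defaults.

```python
def call_aprs(residue_scores, threshold=0.5, min_length=3):
    """Return contiguous high-scoring regions as (start, end) pairs.

    Positions are 1-based and inclusive. Runs shorter than min_length
    are discarded.
    """
    regions = []
    start = None  # 0-based index where the current run began, if any
    for i, score in enumerate(residue_scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(residue_scores) - start >= min_length:
        regions.append((start + 1, len(residue_scores)))
    return regions

# Toy profile, not real PALM output.
profile = [0.1, 0.2, 0.7, 0.8, 0.9, 0.3, 0.6, 0.2, 0.55, 0.6, 0.7]
aprs = call_aprs(profile)  # → [(3, 5), (9, 11)]
```

The isolated high score at position 7 is dropped because it does not reach the minimum length; both the threshold and the minimum length should be tuned against known APRs before being used to guide peptide design.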
