How to Use PepFuNN Peptide Sequence Analysis Online
Commercially Available Online Web Server
PepFuNN: Toolkit for Peptide In Silico Analysis
PepFuNN is a comprehensive, open-source Python toolkit designed to analyze the chemical space of peptide libraries and perform deep Structure-Activity Relationship (SAR) analysis. Developed by researchers at Novo Nordisk, PepFuNN serves as an evolved version of the original PepFun package, tailored specifically for pharmaceutical design campaigns.
By bridging bioinformatics and cheminformatics, PepFuNN enables scientists to study natural and modified peptides, including those with non-canonical amino acids (NCAAs), across sequence, structural, and interaction levels.
Key Innovations: Modular SAR Intelligence
PepFuNN is organized into five specialized modules that streamline the transition from raw peptide data to actionable design insights.
Sequence Analysis: Calculates fundamental physicochemical properties, net charges, and identifies empirical "alerts" for solubility and synthesis issues.
Similarity Mapping: Implements similarity pipelines based on chemical fingerprints and on learned embeddings from Chemical Language Models (CLMs). In the reported benchmarks, these outperform traditional sequence alignments for partitioning peptide datasets.
Matched Molecular Pairs (MMP): Features a dedicated module for identifying matched pairs, allowing researchers to isolate the specific impact of single amino acid substitutions on biological activity.
Clustering & Representation: Groups peptides using molecular fingerprints or calculated descriptors to visualize and explore expansive chemical libraries.
Guided Library Design: Automatically generates peptide libraries based on specific constraints, such as amino acid frequency, macrocyclization requirements, or drug-likeness rules.
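The matched-pair idea from the module list above can be sketched in a few lines of plain Python. This is a minimal illustration, not PepFuNN's own implementation: the helper name `find_matched_pairs` and the toy activity values are hypothetical, and the sketch only handles equal-length sequences written with one-letter codes.

```python
from itertools import combinations


def find_matched_pairs(activities):
    """Return pairs of equal-length peptides that differ at exactly one position.

    `activities` maps a one-letter sequence to a measured activity value.
    Hypothetical helper for illustration; not part of the PepFuNN API.
    """
    pairs = []
    # Sort so the output order does not depend on dict insertion order.
    for (seq_a, act_a), (seq_b, act_b) in combinations(sorted(activities.items()), 2):
        if len(seq_a) != len(seq_b):
            continue
        diffs = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
        if len(diffs) == 1:
            pos = diffs[0]
            pairs.append({
                "from": seq_a,
                "to": seq_b,
                # Substitution label uses a 1-based residue position, e.g. D3E.
                "substitution": f"{seq_a[pos]}{pos + 1}{seq_b[pos]}",
                "delta_activity": act_b - act_a,
            })
    return pairs


# Toy data with made-up activity values.
toy = {"ACDKL": 5.0, "ACEKL": 6.5, "ACDKV": 5.2, "GGGGG": 1.0}
matched = find_matched_pairs(toy)
for pair in matched:
    print(pair["substitution"], pair["delta_activity"])
```

Each reported pair isolates one substitution, so the activity difference can be attributed to that single change. Modified peptides would need monomer lists rather than plain strings.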
Performance Benchmarks
PepFuNN’s methodology has been validated across multiple peptide function prediction tasks, demonstrating the superiority of atom-level and chemical-aware representations over standard sequence metrics.
| Metric | Traditional Alignment | PepFuNN CLM Embeddings | Key Finding |
| SAR Partitioning | Moderate | Superior | Fingerprints better separate canonical datasets |
| NCAA Extrapolation | Fails | Adequate | Joint models generalize to non-natural data |
| Atom-Level Prediction | N/A | Consistently Higher | Atom-mapped features outperform sequence-only |
| Processing Speed | Variable | High-Throughput | Capable of screening massive sequence sets |
Scientific Breakthroughs in Peptide Engineering
Navigating the Non-Canonical Space
Peptide drug discovery increasingly relies on synthetic modifications to improve pharmacology. PepFuNN uses a public monomer dictionary and graph-based representations to capture the unique bonds and connections present in cyclic peptides and those containing non-canonical residues.
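To make the graph idea concrete, here is a small sketch of a residue-level graph for a head-to-tail cyclic peptide containing one non-canonical residue. It rests on several assumptions: the `MONOMERS` dictionary and the `peptide_graph` helper are hypothetical stand-ins, and the real monomer dictionary and atom-level graphs used by PepFuNN are considerably richer.

```python
# Hypothetical, tiny monomer dictionary: symbol -> (name, is_canonical).
MONOMERS = {
    "A": ("alanine", True),
    "G": ("glycine", True),
    "K": ("lysine", True),
    "F": ("phenylalanine", True),
    "Aib": ("2-aminoisobutyric acid", False),
}


def peptide_graph(residues, cyclic=False):
    """Build a residue-level graph as (nodes, edges).

    Nodes map an index to a monomer symbol; edges are (i, j, bond_type) tuples.
    Illustrative only; not part of the PepFuNN API.
    """
    unknown = [r for r in residues if r not in MONOMERS]
    if unknown:
        raise KeyError(f"monomers not in dictionary: {unknown}")
    nodes = {i: r for i, r in enumerate(residues)}
    # Backbone peptide bonds between consecutive residues.
    edges = [(i, i + 1, "peptide") for i in range(len(residues) - 1)]
    # Head-to-tail cyclization adds one extra bond that a linear string cannot express.
    if cyclic and len(residues) > 2:
        edges.append((len(residues) - 1, 0, "peptide"))
    return nodes, edges


nodes, edges = peptide_graph(["A", "Aib", "K", "F", "G"], cyclic=True)
noncanonical = [r for r in nodes.values() if not MONOMERS[r][1]]
print(edges[-1], noncanonical)
```

The closing edge from the last residue back to the first is the kind of connection that a plain one-letter sequence loses and a graph keeps.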
Empirical Solubility & Synthesis Alerts
PepFuNN integrates a robust set of rule-based alerts to flag potential experimental failures early.
Solubility Alerts: Warnings if hydrophobic amino acids make up more than 45% of the sequence or if the absolute value of the net charge at pH 7 exceeds 1.
Synthesis Alerts: Detection of problematic motifs like Asp-Gly (DG) or Asp-Pro (DP), and identification of oxidation-sensitive residues like Methionine or Tryptophan.
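The rules above translate naturally into a short checker. The following is a rough sketch under stated assumptions, not PepFuNN's implementation: the hydrophobic residue set is one common choice among several, the net charge is approximated by counting K/R against D/E (ignoring histidine and the termini), and the function name and alert labels are hypothetical.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumed set; definitions of "hydrophobic" vary
POSITIVE = set("KR")           # counted as +1 each at pH 7 (approximation)
NEGATIVE = set("DE")           # counted as -1 each at pH 7 (approximation)


def peptide_alerts(seq, max_hydrophobic=0.45, max_abs_charge=1):
    """Return a list of rule-based alert labels for a one-letter sequence.

    Hypothetical helper for illustration; not part of the PepFuNN API.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    alerts = []

    # Solubility: fraction of hydrophobic residues.
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in seq) / len(seq)
    if hydrophobic_fraction > max_hydrophobic:
        alerts.append("high_hydrophobicity")

    # Solubility: crude net charge from side chains only.
    net_charge = sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)
    if abs(net_charge) > max_abs_charge:
        alerts.append("high_net_charge")

    # Synthesis: problematic dipeptide motifs.
    for motif in ("DG", "DP"):
        if motif in seq:
            alerts.append(f"motif_{motif}")

    # Synthesis: oxidation-sensitive residues.
    for aa in ("M", "W"):
        if aa in seq:
            alerts.append(f"oxidation_{aa}")

    return alerts


print(peptide_alerts("AVILDGKW"))
```

Keeping the thresholds as keyword arguments makes it easy to tighten or relax a rule for a particular campaign.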
AI-Assisted Extrapolation
By enriching canonical datasets with small proportions of non-canonical peptides, PepFuNN-driven models can build joint representations that effectively extrapolate from standard data to modified peptide sequences.
PepFuNN on Tamarind Bio: Integrated Peptide Workflows
Tamarind Bio provides a professional environment to execute PepFuNN’s Pythonic workflows without the need for manual library management or environment configuration.
No-Code Dashboard: Run similarity analysis and property calculations through a streamlined web interface.
Scalable Clustering: Process massive peptide libraries and generate SMILES-based descriptors using high-performance cloud compute.
How to Use PepFuNN on Tamarind Bio
Access the Platform: Log in to tamarind.bio and select the PepFuNN tool from the toolkit.
Input Sequences: Provide your peptide sequences in FASTA format or enter sequences directly using HELM notation for modified peptides.
Choose Your Analysis:
Property Prediction: Calculate net charge, isoelectric point, and average hydrophobicity.
SAR Analysis: Extract matched molecular pairs to evaluate substitution effects.
Run Solubility/Synthesis Checks: Identify "alerts" for sequence components likely to cause issues in the wet lab.
Generate Conformers (Optional): Use the integrated ETKDG or MODELLER methodologies to predict the most probable 3D conformers for your sequences.
Analyze & Export: View interactive similarity maps and property plots, then download the high-fidelity SMILES strings or PDB files for downstream docking and synthesis.
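For the input step, it can help to prepare sequences locally before uploading. The sketch below parses FASTA text and writes a linear peptide in HELM notation, with multi-character monomer symbols wrapped in square brackets. The helper names are hypothetical. The exact HELM version and monomer symbols the platform accepts are not stated here, so the `V2.0` suffix and the `Aib` symbol are assumptions.

```python
def parse_fasta(text):
    """Parse FASTA text into a {header: sequence} dict (sequences upper-cased)."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; concatenate them.
            records[header] += line.upper()
    return records


def to_helm(monomers):
    """Format a linear peptide as a simple HELM string.

    `monomers` is a sequence of monomer symbols. Symbols longer than one
    character are bracketed. No connections or annotations are emitted.
    """
    tokens = [m if len(m) == 1 else f"[{m}]" for m in monomers]
    body = ".".join(tokens)
    return "PEPTIDE1{" + body + "}$$$$V2.0"


fasta_text = ">pep1 demo\nacdk\nLG\n>pep2\nGG"
records = parse_fasta(fasta_text)
helm_strings = {name: to_helm(seq) for name, seq in records.items()}
modified = to_helm(["A", "Aib", "K"])  # peptide with one non-canonical monomer
print(helm_strings["pep1 demo"], modified)
```

FASTA is sufficient for natural sequences, while the HELM form is the one that can carry non-canonical monomers such as the bracketed symbol in the last example.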