How to Use PepFuNN Peptide Sequence Analysis Online


Commercially Available Online Web Server

PepFuNN: Toolkit for Peptide In Silico Analysis

PepFuNN is a comprehensive, open-source Python toolkit designed to analyze the chemical space of peptide libraries and perform deep Structure-Activity Relationship (SAR) analysis. Developed by researchers at Novo Nordisk, PepFuNN serves as an evolved version of the original PepFun package, tailored specifically for pharmaceutical design campaigns.

By bridging bioinformatics and cheminformatics, PepFuNN enables scientists to study natural and modified peptides, including those with non-canonical amino acids (NCAAs), across sequence, structural, and interaction levels.

Key Innovations: Modular SAR Intelligence

PepFuNN is organized into five specialized modules that streamline the transition from raw peptide data to actionable design insights.

  • Sequence Analysis: Calculates fundamental physicochemical properties, net charges, and identifies empirical "alerts" for solubility and synthesis issues.

  • Similarity Mapping: Implements similarity pipelines based on chemical fingerprints and on learned embeddings from Chemical Language Models (CLMs). In the benchmarks summarized under Performance Benchmarks, these representations partition peptide datasets better than traditional sequence alignments.

  • Matched Molecular Pairs (MMP): Features a dedicated module for identifying matched pairs, allowing researchers to isolate the specific impact of single amino acid substitutions on biological activity.

  • Clustering & Representation: Groups peptides using molecular fingerprints or calculated descriptors to visualize and explore expansive chemical libraries.

  • Guided Library Design: Automatically generates peptide libraries based on specific constraints, such as amino acid frequency, macrocyclization requirements, or drug-likeness rules.
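The matched-pair idea from the list above can be illustrated with a short standalone sketch. It is not PepFuNN's API: the function name `single_substitution_pairs`, the example sequences, and the activity values are all hypothetical. It assumes canonical one-letter sequences of equal length and treats two peptides as a matched pair when they differ at exactly one position.

```python
from itertools import combinations


def single_substitution_pairs(peptides):
    """Find pairs of equal-length sequences that differ at exactly one position.

    `peptides` maps sequence -> activity value (hypothetical units).
    """
    pairs = []
    for (seq_a, act_a), (seq_b, act_b) in combinations(peptides.items(), 2):
        if len(seq_a) != len(seq_b):
            continue  # this sketch ignores insertions and deletions
        diffs = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
        if len(diffs) == 1:
            pos = diffs[0]
            pairs.append({
                "pair": (seq_a, seq_b),
                # 1-based substitution label, e.g. "K9R"
                "change": f"{seq_a[pos]}{pos + 1}{seq_b[pos]}",
                # activity difference attributed to the substitution
                "delta": act_b - act_a,
            })
    return pairs


# Hypothetical library: sequence -> measured activity
library = {
    "ACDEFGHIK": 6.1,
    "ACDEFGHIR": 6.9,
    "ACDAFGHIK": 5.4,
    "WWWWWWWWW": 4.0,
}

mmp = single_substitution_pairs(library)
for entry in mmp:
    print(entry["change"], round(entry["delta"], 2))
```

Each reported pair isolates one substitution, so the activity difference can be read as the effect of that change in an otherwise fixed sequence context. Modified peptides with non-canonical monomers would need a monomer-level comparison rather than this character-level one.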

Performance Benchmarks

PepFuNN’s methodology has been validated across multiple peptide function prediction tasks, demonstrating the superiority of atom-level and chemical-aware representations over standard sequence metrics.

| Metric                | Traditional Alignment | PepFuNN CLM Embeddings | Key Finding                                      |
|-----------------------|-----------------------|------------------------|--------------------------------------------------|
| SAR Partitioning      | Moderate              | Superior               | Fingerprints better separate canonical datasets  |
| NCAA Extrapolation    | Fails                 | Adequate               | Joint models generalize to non-natural data      |
| Atom-Level Prediction | N/A                   | Consistently Higher    | Atom-mapped features outperform sequence-only    |
| Processing Speed      | Variable              | High-Throughput        | Capable of screening massive sequence sets       |

Scientific Breakthroughs in Peptide Engineering

Navigating the Non-Canonical Space

Peptide drug discovery increasingly relies on synthetic modifications to improve pharmacology. PepFuNN uses a public monomer dictionary and graph-based representations to capture the unique bonds and connections present in cyclic peptides and those containing non-canonical residues.

Empirical Solubility & Synthesis Alerts

PepFuNN integrates a robust set of rule-based alerts to flag potential experimental failures early.

  • Solubility Alerts: Warns if hydrophobic amino acids make up more than 45% of the sequence or if the absolute net charge at pH 7 exceeds 1.

  • Synthesis Alerts: Detection of problematic motifs like Asp-Gly (DG) or Asp-Pro (DP), and identification of oxidation-sensitive residues like Methionine or Tryptophan.
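The alerts listed above are simple enough to sketch in plain Python. The block below is an illustration of the stated rules only, not PepFuNN's implementation: the function names, the hydrophobic residue set, and the crude charge model (K/R as +1, D/E as -1, His and termini ignored) are assumptions made for this example.

```python
# Assumption: which residues count as hydrophobic is not stated in the text;
# this set is chosen for illustration only.
HYDROPHOBIC = set("AVILMFW")


def net_charge_ph7(seq):
    """Crude integer net charge at pH 7 (assumption: K/R = +1, D/E = -1)."""
    positive = sum(seq.count(r) for r in "KR")
    negative = sum(seq.count(r) for r in "DE")
    return positive - negative


def peptide_alerts(seq, max_hydrophobic=0.45, max_abs_charge=1):
    """Return rule-based alert labels for a canonical one-letter sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    alerts = []
    # Solubility: hydrophobic fraction above the 45% threshold
    hydrophobic_fraction = sum(r in HYDROPHOBIC for r in seq) / len(seq)
    if hydrophobic_fraction > max_hydrophobic:
        alerts.append("solubility:hydrophobic")
    # Solubility: absolute net charge above 1
    if abs(net_charge_ph7(seq)) > max_abs_charge:
        alerts.append("solubility:charge")
    # Synthesis: problematic Asp-Gly and Asp-Pro motifs
    for motif in ("DG", "DP"):
        if motif in seq:
            alerts.append(f"synthesis:{motif}")
    # Synthesis: oxidation-sensitive Met or Trp
    if any(r in seq for r in "MW"):
        alerts.append("synthesis:oxidation")
    return alerts


for peptide in ("KDGSM", "AVILKKK", "GSTSGK"):
    print(peptide, peptide_alerts(peptide))
```

Returning short labels rather than free text keeps the result easy to filter or count across a whole library. The thresholds are exposed as parameters because the right cutoffs depend on the campaign.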

AI-Assisted Extrapolation

By enriching canonical datasets with small proportions of non-canonical peptides, PepFuNN-driven models can build joint representations that effectively extrapolate from standard data to modified peptide sequences.

PepFuNN on Tamarind Bio: Integrated Peptide Workflows

Tamarind Bio provides a professional environment to execute PepFuNN’s Pythonic workflows without the need for manual library management or environment configuration.

  • No-Code Dashboard: Run similarity analysis and property calculations through a streamlined web interface.

  • Scalable Clustering: Process massive peptide libraries and generate SMILES-based descriptors using high-performance cloud compute.

How to Use PepFuNN on Tamarind Bio

  1. Access the Platform: Log in to tamarind.bio and select the PepFuNN tool from the toolkit.

  2. Input Sequences: Provide your peptide sequences in FASTA format or enter sequences directly using HELM notation for modified peptides.

  3. Choose Your Analysis:

    • Property Prediction: Calculate net charge, isoelectric point, and average hydrophobicity.

    • SAR Analysis: Extract matched molecular pairs to evaluate substitution effects.

  4. Run Solubility/Synthesis Checks: Identify "alerts" for sequence components likely to cause issues in the wet lab.

  5. Generate Conformers (Optional): Use the integrated ETKDG or MODELLER methodologies to predict the most probable 3D conformers for your sequences.

  6. Analyze & Export: View interactive similarity maps and property plots, then download the high-fidelity SMILES strings or PDB files for downstream docking and synthesis.
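For readers who want to sanity-check a FASTA input before uploading it, the sketch below parses FASTA text and computes an average hydrophobicity per sequence. It is a standalone illustration, not the platform's or PepFuNN's code: the function names are hypothetical, and the Kyte-Doolittle scale is an assumption made here, since the text does not say which hydrophobicity scale the tool uses.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this illustration)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def parse_fasta(text):
    """Parse FASTA text into a {header: sequence} dict."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may wrap; concatenate them under the last header
            records[header] += line.upper()
    return records


def average_hydrophobicity(seq):
    """Mean hydropathy per residue; raises KeyError on non-canonical letters."""
    return sum(KYTE_DOOLITTLE[r] for r in seq) / len(seq)


fasta = """>pep1
ACDEFGHIK
>pep2
AK
"""

for name, seq in parse_fasta(fasta).items():
    print(name, seq, round(average_hydrophobicity(seq), 2))
```

The `KeyError` on unknown letters is deliberate: a sequence with modified residues should be entered in HELM notation instead, as described in step 2.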

