How to Use COMPSS Protein Metrics Online

Commercially Available Online Web Server

COMPSS: A Framework for Designing Functional Enzymes

Scientists have developed COMPSS (Composite Computational Metrics for Protein Sequence Selection), a powerful new framework for filtering and selecting active enzyme sequences generated by computational models. The core challenge in protein engineering is that while generative models can produce countless novel sequences, predicting which ones will actually fold correctly and be functional remains difficult. COMPSS addresses this by validating a diverse set of computational metrics and combining them into a filter that significantly improves the success rate of finding active enzymes. In experiments, this filter improved the rate of experimental success by 50-150%.
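To make the 50-150% figure concrete, the short sketch below assumes it is a relative improvement over the unfiltered success rate. The 20% baseline is purely hypothetical and is not taken from the study.

```python
def improved_rate(baseline: float, relative_gain: float) -> float:
    """Success rate after a relative improvement, capped at 100%."""
    return min(1.0, baseline * (1.0 + relative_gain))


# Hypothetical baseline: 20% of unfiltered sequences are active.
baseline = 0.20

low = improved_rate(baseline, 0.50)   # 50% relative improvement
high = improved_rate(baseline, 1.50)  # 150% relative improvement

print(f"Filtered success rate: {low:.0%} to {high:.0%}")
```

Under these assumptions, a campaign that would otherwise see one active enzyme in five tested would instead see roughly one in three to one in two.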

How the COMPSS Pipeline Works

The COMPSS framework uses a multi-step approach to identify promising sequences for experimental testing.

  1. Alignment-Based and Sequence-Based Metrics: The pipeline first uses quick, sequence-based quality checks to remove sequences that are unlikely to be functional, such as those with long repeats or transmembrane domains. It then scores sequences with protein language models like ESM-1v to pre-filter for promising candidates.

  2. Structure-Based Metrics: For sequences that pass the initial filters, COMPSS uses more computationally expensive, structure-based metrics. It predicts the protein's 3D structure using tools like AlphaFold2 and then evaluates it with inverse folding models, such as ProteinMPNN, which score how well a sequence fits a given structure.

  3. Composite Filter: By combining these metrics, COMPSS creates a robust filter that is more effective than any single metric alone. The study found that while some metrics were predictive of activity, no single metric was sufficient to handle the multiple failure modes that can occur in protein expression and folding.
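The staged logic above can be sketched in a few lines of Python. This is a minimal illustration, not the COMPSS implementation: the function names, the single-residue repeat check, and the thresholds are all assumptions, the transmembrane-domain check is omitted, and the two scoring functions are passed in as plain callables standing in for a language-model score (such as ESM-1v) and an inverse-folding score (such as ProteinMPNN).

```python
from typing import Callable, List

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def longest_repeat_run(seq: str) -> int:
    """Length of the longest run of one repeated residue (a simplified repeat check)."""
    best = run = 0
    prev = ""
    for ch in seq:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best


def passes_quality_checks(seq: str, max_repeat: int = 6) -> bool:
    """Cheap sanity checks; max_repeat is a hypothetical threshold."""
    return bool(seq) and set(seq) <= VALID_RESIDUES and longest_repeat_run(seq) <= max_repeat


def composite_filter(
    seqs: List[str],
    seq_score: Callable[[str], float],     # stand-in for a language-model score
    struct_score: Callable[[str], float],  # stand-in for an inverse-folding score
    keep_after_seq: int,
    keep_final: int,
) -> List[str]:
    """Apply cheap checks first, then run the expensive score only on survivors."""
    # Stage 1: discard sequences that fail the quick quality checks.
    passed = [s for s in seqs if passes_quality_checks(s)]
    # Stage 2: rank by the cheap sequence-based score (higher is better).
    shortlisted = sorted(passed, key=seq_score, reverse=True)[:keep_after_seq]
    # Stage 3: rank the shortlist by the costly structure-based score.
    return sorted(shortlisted, key=struct_score, reverse=True)[:keep_final]
```

The point of the ordering is cost: the structure-based score is only ever computed for `keep_after_seq` sequences, regardless of how large the generated library is.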

What is Tamarind Bio?

Tamarind Bio is a pioneering no-code bioinformatics platform built to democratize access to powerful computational tools for life scientists and researchers. Recognizing that many cutting-edge machine learning models are often difficult to deploy and use, Tamarind provides an intuitive, web-based environment that completely abstracts away the complexities of high-performance computing, software dependencies, and command-line interfaces.

The platform is designed to provide easy access for biologists, chemists, and other researchers who may not have a background in programming or cloud infrastructure but want to run experimental models on their data. Key features include a user-friendly graphical interface for setting up and launching experiments, a robust API for integration into existing research pipelines, and an automated system for managing and scaling computational resources. By handling the technical heavy lifting, Tamarind empowers researchers to concentrate on their scientific questions and accelerate the pace of discovery.

Accelerating Discovery with COMPSS on Tamarind Bio

Using the COMPSS framework on a platform like Tamarind can dramatically accelerate enzyme discovery and protein engineering campaigns.

  • Efficient Screening: Instead of naively generating and testing thousands of sequences with a low success rate, researchers can use the COMPSS filter to screen vast libraries of generated sequences and identify the most likely functional candidates for experimental validation. This greatly reduces the time and cost associated with wet-lab work.

  • High-Quality Libraries: The framework can be used to assemble diverse libraries of active enzymes, with the study achieving success rates as high as 100% in some cases. This provides a powerful starting point for further engineering and optimization through directed evolution.

  • Accessible Workflow: By integrating the COMPSS workflow into a no-code platform, Tamarind democratizes access to advanced protein engineering techniques. Researchers can simply upload their sequences, and the platform handles the computationally intensive work of running multiple metrics and returning a prioritized list of candidates for testing.

How to Use COMPSS on Tamarind Bio

  1. Access the Platform: Log in to the tamarind.bio website.

  2. Select COMPSS: From the list of available computational models, choose the COMPSS tool.

  3. Generate Sequences: Start with sequences from a generative approach such as ProteinGAN, ESM-MSA, or ancestral sequence reconstruction (ASR).

  4. Apply Filters: Run your sequences through the multi-step COMPSS filter on the platform. The system automatically performs quality checks, calculates ESM-1v scores, and for the top candidates, predicts structures and calculates ProteinMPNN scores.

  5. Select Candidates: The platform provides a ranked list of sequences, so you can select those with the highest predicted probability of being active and improve your experimental success rate.
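Once you have a ranked list, picking candidates for the wet lab is a simple top-N selection. The sketch below assumes the results are available as CSV text with `sequence_id` and `score` columns. That layout, the column names, and the example identifiers are hypothetical and are not a description of Tamarind's actual export format.

```python
import csv
import io
from typing import List


def top_candidates(csv_text: str, n: int, score_column: str = "score") -> List[str]:
    """Return the IDs of the n highest-scoring sequences (higher score = better)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: float(r[score_column]), reverse=True)
    return [r["sequence_id"] for r in rows[:n]]


# Hypothetical results table; real column names and scores will differ.
example = """sequence_id,score
gen_001,0.42
gen_002,0.91
gen_003,0.67
"""

print(top_candidates(example, 2))  # ['gen_002', 'gen_003']
```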
