How to Use RFdiffusion2 Online

Commercially Available Online Web Server

RFdiffusion2: A New Era of Enzyme Design

Scientists have introduced RFdiffusion2, a deep generative method that substantially improves the design of novel enzymes. Earlier approaches, even with advanced tools, were limited by two requirements: the positions of the catalytic residues in the protein sequence had to be specified in advance, and computationally expensive "inverse rotamer" sampling was needed to work out backbone positions for those residues from the active-site geometry. RFdiffusion2 removes both requirements, enabling enzyme design directly from a minimal, sequence-agnostic description of the functional group locations.

How RFdiffusion2 Works

RFdiffusion2 builds on the RoseTTAFold All-Atom (RFAA) neural network architecture, the same architecture behind RFdiffusion All-Atom (RFdiffusionAA), and is trained with a simpler framework called flow matching for improved stability and efficiency. Its key innovation is the ability to condition on motifs at the atomic level, without needing to know their sequence indices or full rotamer conformations beforehand.

  • Atomic Motif Conditioning: Instead of being constrained to the residue-level representation of previous methods, RFdiffusion2 can generate protein structures by conditioning on the precise coordinates of individual atoms, such as the side chain functional groups in an active site.

  • Unindexed Motifs: The model learns to infer the optimal placement of catalytic residues along the protein sequence on its own, removing the need for a pre-specified sequence index. This vastly expands the possible design solutions.

  • Enhanced Control: RFdiffusion2 also provides additional controls for designers, including the ability to specify the solvent accessibility of ligand atoms and to define a general orientation for the generated protein scaffold relative to the active site.
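To make the idea of an unindexed atomic motif concrete, here is a minimal Python sketch. It is not RFdiffusion2's input format: the `MotifAtom` schema, the helper names, and the coordinates are all hypothetical placeholders. The sketch only shows that a motif atom can carry coordinates without a sequence position, and counts how many ordered placements of distinct catalytic residues exist in a chain of a given length (a rough upper bound that ignores any spacing or geometric constraints).

```python
from dataclasses import dataclass
from math import perm
from typing import Optional


@dataclass(frozen=True)
class MotifAtom:
    """One functional-group atom of an active site (hypothetical schema)."""
    residue_name: str                      # e.g. "HIS"
    atom_name: str                         # e.g. "NE2"
    xyz: tuple[float, float, float]        # Cartesian coordinates in angstroms
    sequence_index: Optional[int] = None   # None means the position is left to the model


def is_unindexed(atoms: list[MotifAtom]) -> bool:
    """Return True when no atom pins its residue to a sequence position."""
    return all(a.sequence_index is None for a in atoms)


def index_assignments(protein_length: int, n_catalytic_residues: int) -> int:
    """Count ordered placements of distinct residues at distinct positions."""
    return perm(protein_length, n_catalytic_residues)


# Placeholder coordinates, chosen only for illustration.
motif = [
    MotifAtom("HIS", "NE2", (1.2, 0.4, -3.1)),
    MotifAtom("SER", "OG", (2.9, 1.7, -1.0)),
    MotifAtom("ASP", "OD1", (-0.6, 2.2, -4.4)),
]

unindexed = is_unindexed(motif)
n_assignments = index_assignments(200, 3)
print(unindexed)
print(n_assignments)
```

Under these assumptions, three catalytic residues in a 200-residue chain can be indexed in several million ways, which gives a sense of how much choice a pre-specified index removes.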

On a new benchmark of 41 diverse active sites, RFdiffusion2 scaffolded all 41, whereas the previous state-of-the-art method, RFdiffusion, scaffolded only 16.
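For readers who prefer percentages, the benchmark counts above can be converted with a few lines of Python. The function name is illustrative; the only inputs are the two counts quoted in the text.

```python
def success_rate(scaffolded: int, total: int) -> float:
    """Percentage of benchmark active sites that were scaffolded."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * scaffolded / total


rfdiffusion_rate = success_rate(16, 41)
rfdiffusion2_rate = success_rate(41, 41)
print(f"RFdiffusion:  {rfdiffusion_rate:.1f}%")
print(f"RFdiffusion2: {rfdiffusion2_rate:.1f}%")
```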

Comparing RFdiffusion2 & RFdiffusion


Active Site Specification

  • RFdiffusion: Requires specification at the residue level, often with complex inverse rotamer calculations.

  • RFdiffusion2: Allows atomic-level or partial ligand specification, eliminating the need for those calculations.

Design Success Rate (AME Benchmark)

  • RFdiffusion: Generated scaffolds for only 16 of 41 diverse enzyme active sites.

  • RFdiffusion2: Generated scaffolds for all 41 sites, a 100% success rate.

Flexibility & Control

  • RFdiffusion: More rigid design, requiring pre-specified sequence indices for catalytic residues.

  • RFdiffusion2: Sequence-agnostic design with added features like stochastic centering and partial ligand specification for greater control.

Training & Architecture

  • RFdiffusion: Based on the original RFdiffusion framework.

  • RFdiffusion2: Uses a more efficient flow matching training framework and the RoseTTAFold All-Atom neural network architecture to model side chain conformations directly.

We encourage you to test both models out to see which works best for your specific experiment.

What is Tamarind Bio?

Tamarind Bio is a pioneering no-code bioinformatics platform built to democratize access to powerful computational tools for life scientists and researchers. Recognizing that many cutting-edge machine learning models are often difficult to deploy and use, Tamarind provides an intuitive, web-based environment that completely abstracts away the complexities of high-performance computing, software dependencies, and command-line interfaces.

The platform is designed to provide easy access for biologists, chemists, and other researchers who may not have a background in programming or cloud infrastructure but want to run experimental models on their data. Key features include a user-friendly graphical interface for setting up and launching experiments, a robust API for integration into existing research pipelines, and an automated system for managing and scaling computational resources. By handling the technical heavy lifting, Tamarind empowers researchers to concentrate on their scientific questions and accelerate the pace of discovery.

Accelerating Discovery with RFdiffusion2 on Tamarind Bio

Running RFdiffusion2 on a platform like Tamarind can make enzyme design more accessible and efficient for researchers.

  • Design from First Principles: Instead of relying on existing protein scaffolds, researchers could start directly from a desired chemical reaction mechanism. With RFdiffusion2 on Tamarind, they could provide the atomic coordinates of a catalytic site and let the model design a novel protein scaffold around it, accelerating the creation of enzymes for new reactions.

  • High-Throughput and Automation: The model's ability to automatically infer rotamers and sequence indices removes a major computational bottleneck, enabling the rapid generation of a large number of diverse design candidates. Researchers could then use the platform to screen these designs with tools like LigandMPNN and AlphaFold3 to select the most promising candidates for experimental validation.

  • Experimental Success: The paper demonstrates that RFdiffusion2 can generate enzymes that are functional in vitro for retroaldolases, cysteine hydrolases, and zinc hydrolases, often finding active catalysts by testing fewer than 96 sequences. This high success rate makes the experimental validation process significantly more efficient.
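The screening step described above can be sketched as a simple filter-and-rank pass. This is a minimal illustration, not the platform's pipeline: the `Candidate` fields, the thresholds, and the default batch size are assumptions made for the example, and the scores would in practice come from whichever sequence-design and structure-prediction tools are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A design candidate with hypothetical screening metrics."""
    name: str
    confidence: float   # assumed 0-100 structure-prediction confidence
    motif_rmsd: float   # assumed active-site deviation in angstroms


def select_for_testing(candidates, min_confidence=80.0,
                       max_motif_rmsd=1.5, batch_size=96):
    """Keep candidates passing both thresholds, best active-site match first."""
    passing = [
        c for c in candidates
        if c.confidence >= min_confidence and c.motif_rmsd <= max_motif_rmsd
    ]
    # Lower motif RMSD ranks first; higher confidence breaks ties.
    passing.sort(key=lambda c: (c.motif_rmsd, -c.confidence))
    return passing[:batch_size]


# Made-up scores, for illustration only.
pool = [
    Candidate("design_1", confidence=91.0, motif_rmsd=0.8),
    Candidate("design_2", confidence=75.0, motif_rmsd=0.5),  # fails confidence
    Candidate("design_3", confidence=88.0, motif_rmsd=0.6),
    Candidate("design_4", confidence=95.0, motif_rmsd=2.3),  # fails motif RMSD
]

selected = select_for_testing(pool)
selected_names = [c.name for c in selected]
print(selected_names)
```

Sorting on active-site agreement before confidence is one possible design choice; a different project might weight the metrics differently or add further filters.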

How to Use RFdiffusion2 on Tamarind Bio

To leverage RFdiffusion2's power, a researcher could follow this streamlined workflow:

  1. Access the Platform: Begin by logging in to the tamarind.bio website.

  2. Select RFdiffusion2: From the list of available computational models, choose the RFdiffusion2 tool.

  3. Define a Theozyme: Provide the atomic coordinates of the catalytic residues and any associated ligands or cofactors.

  4. Specify Constraints: Optionally, use the platform's interface to set conditions like the desired solvent exposure of specific atoms (RASA) or the overall orientation of the protein scaffold using an ORI pseudo-atom.

  5. Generate Designs: The platform would run RFdiffusion2 to produce a set of novel protein structures that scaffold the defined active site.

  6. Validate and Refine: The designs could then be automatically analyzed for sequence and structural fitness using integrated tools, providing a prioritized list of candidates for a researcher to take into the lab.
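Steps 3 and 4 of the workflow above amount to assembling a job description. The following Python sketch shows one way such a description could be put together and checked before submission. Every field name here (`theozyme`, `exposure`, `ori`, `num_designs`), the "buried"/"exposed" labels, and the choice to place the ORI point at an offset from the theozyme centroid are assumptions for illustration; they are not Tamarind's or RFdiffusion2's actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TheozymeAtom:
    """One atom of the theozyme (hypothetical schema)."""
    group: str   # residue or ligand name, e.g. "HIS" or "LIG"
    atom: str    # atom name, e.g. "NE2"
    x: float
    y: float
    z: float


def centroid(atoms):
    """Mean position of the theozyme atoms."""
    if not atoms:
        raise ValueError("a theozyme needs at least one atom")
    n = len(atoms)
    return [sum(a.x for a in atoms) / n,
            sum(a.y for a in atoms) / n,
            sum(a.z for a in atoms) / n]


def build_job(atoms, exposure=None, ori_offset=(0.0, 0.0, 0.0), num_designs=10):
    """Assemble an illustrative job description; all keys are assumptions."""
    exposure = dict(exposure or {})
    unknown = set(exposure) - {a.atom for a in atoms}
    if unknown:
        raise ValueError(f"exposure set for atoms not in the theozyme: {sorted(unknown)}")
    if any(label not in ("buried", "exposed") for label in exposure.values()):
        raise ValueError("exposure labels must be 'buried' or 'exposed'")
    # Assumed convention: ORI point = theozyme centroid shifted by ori_offset.
    ori = [c + d for c, d in zip(centroid(atoms), ori_offset)]
    return {
        "model": "rfdiffusion2",
        "theozyme": [asdict(a) for a in atoms],
        "exposure": exposure,
        "ori": ori,
        "num_designs": num_designs,
    }


# Placeholder coordinates, for illustration only.
atoms = [
    TheozymeAtom("HIS", "NE2", 1.0, 2.0, 3.0),
    TheozymeAtom("SER", "OG", 3.0, 2.0, 1.0),
    TheozymeAtom("LIG", "C1", 2.0, 5.0, 2.0),
]

job = build_job(atoms, exposure={"C1": "buried"}, ori_offset=(0.0, 0.0, 5.0))
print(json.dumps(job, indent=2))
```

Validating atom names and labels before submission catches typos early, which matters when each design run consumes compute time.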
