How to Use RNA-FM Online

Commercially Available Online Web Server

RNA-FM: The First Foundation Model for the RNA Universe

RNA-FM (RNA Foundation Model) is a computational model that uses self-supervised learning to learn the "language" of non-coding RNA (ncRNA) directly from sequence. Developed by researchers at the Chinese University of Hong Kong and Fudan University, RNA-FM serves as a general-purpose backbone for RNA structure and function prediction.

By training on roughly 23.7 million unannotated sequences, RNA-FM sidesteps the long-standing bottleneck of limited annotated data, providing a "One-For-All" solution for the RNA field.

Key Innovations: Self-Supervised RNA Intelligence

RNA-FM is designed to extract deep sequential and evolutionary information from raw RNA sequences without requiring expensive experimental labels.

  • Massive Scale Training: Trained on 23.7 million ncRNA sequences from the RNAcentral database, representing over 800,000 species.

  • Interpretable Representations: Generates rich L x 640 embedding matrices that implicitly capture structural, functional, and evolutionary patterns.

  • Zero-Label Discovery: Capable of inferring evolutionary trends in lncRNAs and SARS-CoV-2 variants purely from sequence data.

  • Unified Downstream Integration: A versatile foundation that improves performance across secondary structure prediction, 3D modeling, and protein-RNA binding preference.

  • Generalized Mastery: Successfully handles sequences outside its training domain, such as mRNA 5' UTRs and viral genomes.
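To make the L x 640 shape concrete, here is a minimal sketch of how a per-nucleotide embedding matrix can be reduced to one fixed-length vector per sequence by mean pooling. The matrix is filled with random placeholder values, not real RNA-FM output, and the helper names (`placeholder_embedding`, `mean_pool`) are hypothetical. Mean pooling is one common choice for sequence-level tasks, not necessarily the one RNA-FM's downstream models use.

```python
import random

EMBED_DIM = 640  # per-nucleotide embedding width stated in the text


def placeholder_embedding(sequence, seed=0):
    """Return an L x 640 matrix of random floats standing in for model output."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in sequence]


def mean_pool(embedding):
    """Average over the length axis to get one vector per sequence."""
    length = len(embedding)
    if length == 0:
        raise ValueError("cannot pool an empty embedding")
    width = len(embedding[0])
    return [sum(row[j] for row in embedding) / length for j in range(width)]


seq = "GGGAAACUUCGGUUUCCC"          # 18 nucleotides
emb = placeholder_embedding(seq)    # 18 rows, 640 columns
pooled = mean_pool(emb)             # 640 values
print(len(emb), len(emb[0]), len(pooled))
```

The pooled vector has the same width regardless of sequence length, which is what lets sequences of different lengths be compared or fed to a simple classifier.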

Performance Benchmarks

RNA-FM consistently outperforms state-of-the-art (SOTA) thermodynamic and deep learning methods across diverse benchmarks.

| Task | Metric | RNA-FM Result | Key Finding |
|---|---|---|---|
| Secondary Structure | F1 Score (ArchiveII) | 0.941 | Outperforms LinearFold by 30% |
| 3D Closeness | Top-L Precision | 0.66 (0.86 with TL) | 33% gain over RNAcontact |
| Distance Prediction | Pixel Accuracy (PA) | 86.13% | 39% increase in R^2 over sequence alone |
| RBP-RNA Binding | Mean AUPRC | 0.833 | Comparable to in vivo experimental features |

Structural & Functional Breakthroughs

RNA-FM bridges the gap between raw sequence data and biological insight, offering unparalleled precision in molecular modeling.

Advanced Structure Prediction & 3D Reconstruction

RNA-FM identifies vital base pairs that thermodynamic methods often miss. By providing more accurate constraints, it facilitates the reconstruction of 3D structures with significantly lower RMSD.

  • Secondary Structure: Superior performance on long sequences (>150nt) and low-redundancy datasets.

  • 3D Closeness: Replaces time-consuming Multiple Sequence Alignments (MSA) with pure single-sequence embeddings.
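Top-L precision, the metric quoted for 3D closeness, can be sketched as follows: rank all candidate nucleotide pairs by predicted probability, keep the L highest (L being the sequence length), and report the fraction that are true contacts. The `min_separation` parameter and the toy matrix below are illustrative assumptions. The actual benchmark defines contacts by a distance threshold and may filter out pairs that are close in sequence.

```python
def top_l_precision(prob, contacts, min_separation=1):
    """Fraction of the L highest-scoring pairs (i < j) that are true contacts.

    prob     -- L x L symmetric matrix of predicted contact probabilities
    contacts -- set of true contacts as (i, j) tuples with i < j
    """
    length = len(prob)
    candidates = [
        (prob[i][j], i, j)
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[:length]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if (i, j) in contacts)
    return hits / len(top)


# Toy 4-nucleotide example with made-up probabilities.
toy_prob = [
    [0.0, 0.9, 0.1, 0.8],
    [0.9, 0.0, 0.7, 0.2],
    [0.1, 0.7, 0.0, 0.6],
    [0.8, 0.2, 0.6, 0.0],
]
toy_contacts = {(0, 1), (0, 3), (1, 3)}
precision = top_l_precision(toy_prob, toy_contacts)
print(precision)
```

In the toy case the four top-ranked pairs are (0, 1), (0, 3), (1, 2) and (2, 3), of which two are true contacts.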

SARS-CoV-2 Evolution & Genome Study

RNA-FM has been applied to the entire SARS-CoV-2 genome to predict the secondary structures of regulatory elements such as the 5' UTR and 3' UTR. It also traces the evolutionary trajectory from the Alpha variant to the Omicron variants (BA.1 and BA.2) without any task-specific fine-tuning.

Gene Expression & Protein Interaction

The model performs well at modeling protein-RNA binding preferences and predicting Mean Ribosome Loading (MRL). Although it was trained on non-coding RNA, it generalizes to mRNA untranslated regions, which supports the study of gene expression regulation.

RNA-FM on Tamarind Bio: Accelerate Your Discovery

Tamarind Bio is a no-code bioinformatics platform that democratizes access to powerful tools like RNA-FM. By abstracting away the complexities of GPU orchestration, high-performance computing, and software dependencies, Tamarind allows scientists to focus on the biology rather than the DevOps.

  • No-Code Interface: Launch RNA-FM experiments through an intuitive web dashboard.

  • Scalable Infrastructure: Run massive virtual screens on hundreds of thousands of inputs using Tamarind’s secure cloud.

How to Use RNA-FM on Tamarind Bio

Researchers can leverage the "One-For-All" capabilities of RNA-FM through a streamlined, no-code workflow on the Tamarind platform to predict RNA structures and functions:

  • Access the Platform: Log in to the tamarind.bio website to access the suite of bioinformatics tools.

  • Select RNA-FM: From the library of available foundation models, choose the RNA-FM tool.

  • Input RNA Sequences: Provide the raw RNA sequence(s) you wish to analyze. The model is optimized for non-coding RNA (ncRNA) but can also process mRNA untranslated regions (UTRs).

  • Choose Your Task: Select the specific downstream application for your research:

    • Structure Prediction: Generate secondary structures, 3D closeness maps, or distance maps.

    • Functional Modeling: Predict protein-RNA binding preferences or gene expression regulation (MRL).

    • Evolutionary Analysis: Perform trajectory inference to visualize evolutionary trends for viral variants or lncRNAs.

  • Run and Extract Embeddings: The platform executes the transformer-based model to produce high-dimensional L x 640 embeddings. These representations implicitly capture structural and evolutionary patterns without the need for Multiple Sequence Alignment (MSA).

  • Analyze the Results: Review the comprehensive reports, which may include F1 confidence scores for folding, interactive 2D/3D probability maps, or trajectory stream-plots for variant study.
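Before the input step of the workflow above, it can help to clean sequences locally. The sketch below parses FASTA text, uppercases each sequence, converts T to U, and rejects characters outside A, C, G and U. The strict four-letter alphabet and the T-to-U conversion are assumptions made for illustration; the platform's actual input rules may differ (for example, it may accept ambiguity codes such as N). The function names are hypothetical and no Tamarind API is involved.

```python
VALID_BASES = set("ACGU")  # assumed alphabet; ambiguity codes are rejected here


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to raw sequence."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                raise ValueError("FASTA header has no identifier")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before any FASTA header")
        else:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def normalize_rna(sequence):
    """Uppercase, convert DNA-style T to U, and check the alphabet."""
    cleaned = sequence.upper().replace("T", "U")
    unexpected = sorted(set(cleaned) - VALID_BASES)
    if unexpected:
        raise ValueError(f"unexpected characters: {unexpected}")
    return cleaned


fasta_text = ">utr_example\ngggaaact\nTCGG\n>hairpin\nGGGAAACCC\n"
prepared = {name: normalize_rna(seq) for name, seq in parse_fasta(fasta_text).items()}
print(prepared)
```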

