Brief Bioinform. 2024 Nov 22;26(1):bbae682.
doi: 10.1093/bib/bbae682.

R3Design: deep tertiary structure-based RNA sequence design and beyond

Cheng Tan et al. Brief Bioinform.

Abstract

The rational design of ribonucleic acid (RNA) molecules is crucial for advancing therapeutic applications and synthetic biology and for understanding the fundamental principles of life. Traditional RNA design methods have predominantly focused on secondary structure-based sequence design, often neglecting the intricate and essential tertiary interactions. We introduce R3Design, a tertiary structure-based RNA sequence design method that shifts the paradigm to prioritize tertiary structure in RNA sequence design. R3Design significantly enhances sequence design on native RNA backbones, achieving high sequence recovery and Macro-F1 score and outperforming traditional secondary structure-based approaches by substantial margins. We demonstrate that R3Design can design RNA sequences that fold into the desired tertiary structures by validating these predictions using advanced structure prediction models. This method, which is available through standalone software, provides a comprehensive toolkit for designing, folding, and evaluating RNA at the tertiary level. Our findings demonstrate R3Design's superior capability in designing RNA sequences: it achieves around 44% in both recovery score and Macro-F1 score on multiple datasets. This not only denotes the accuracy and fairness of the model but also underscores its potential to drive forward the development of innovative RNA-based therapeutics and to deepen our understanding of RNA biology.

Keywords: RNA; artificial intelligence; biomolecular engineering; graph neural networks; inverse folding.
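
The two sequence-level metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the paper, so the following are assumptions: recovery is taken as the fraction of positions where the designed base equals the native base, and Macro-F1 as the unweighted mean of per-nucleotide F1 scores over A, C, G, and U, with a class that never occurs scored as 0. The function names `recovery` and `macro_f1` are hypothetical and are not taken from the R3Design software.

```python
NUCLEOTIDES = "ACGU"  # assumed label set for RNA


def recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed base matches the native base."""
    if not native or len(native) != len(designed):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(n == d for n, d in zip(native, designed))
    return matches / len(native)


def macro_f1(native: str, designed: str, labels: str = NUCLEOTIDES) -> float:
    """Unweighted mean of per-base F1 scores (assumed definition)."""
    if not native or len(native) != len(designed):
        raise ValueError("sequences must be non-empty and of equal length")
    scores = []
    for base in labels:
        tp = sum(n == base and d == base for n, d in zip(native, designed))
        fp = sum(n != base and d == base for n, d in zip(native, designed))
        fn = sum(n == base and d != base for n, d in zip(native, designed))
        denom = 2 * tp + fp + fn
        # Assumption: a base absent from both sequences contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    native = "ACGUACGU"
    designed = "ACGUACGA"  # one substitution at the last position
    print(f"recovery = {recovery(native, designed):.3f}")
    print(f"macro-F1 = {macro_f1(native, designed):.3f}")
```

Because Macro-F1 weights each nucleotide class equally, it penalizes a model that reaches high recovery by over-predicting the most frequent base, which is one plausible reading of the "fairness" the abstract refers to.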

Conflict of interest statement

None declared.

Figures

Figure 1
Violin plot on the sequence-level metrics across our benchmark, Rfam, and RNA-Puzzles datasets. (A) The first row shows the recovery rate comparison on the benchmark dataset with short, medium, and long splits. (B) The second row shows the Macro F1 comparison on the benchmark dataset with short, medium, and long splits. (C) The third row shows the recovery rate comparison on the complete benchmark dataset, Rfam, and RNA-Puzzles datasets. (D) The fourth row shows the Macro F1 comparison on the complete benchmark dataset, Rfam, and RNA-Puzzles datasets.
Figure 2
Performance comparison of R3Design and other baseline models on folding secondary structure. The accuracy is that of the secondary structure predicted from the designed sequences, and the recovered sequence rate is the foldable sequence rate. (A) The metrics on the Rfam dataset. (B) The metrics on the RNA-Puzzles dataset.
Figure 3
Comparative analysis of tertiary structure predictions for RNA sequences designed by R3Design. For each RNA molecule, we display both the native sequence and the sequence designed by R3Design. Tertiary structures predicted from these sequences using DRfold, trRosettaRNA, AlphaFold3, and RoseTTAFoldNA are shown. RMSD values are calculated to assess the accuracy of the predicted structures relative to the native tertiary structures. Bases in the designed sequences that differ from the native sequence are highlighted in red.
Figure 4
The detailed modular architecture of the R3Design software. The pipeline comprises three main components: (i) RNA sequence redesign using R3Design, based on the input tertiary structure, (ii) comprehensive evaluations at the sequence level, secondary structure level, and tertiary structure level, and (iii) the final output, which includes the optimized RNA sequences along with their corresponding evaluation metrics.
Figure 5
Overall framework of R3Design. (A) The overview of the R3Design pipeline. (B) The graph-based RNA tertiary structure modeling. (C) The secondary structure auxiliary task. (D) The model architecture of the backbone encoder and the sequence decoder. (E) The iterative sequence refinement process.
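
The RMSD mentioned in the Figure 3 caption can be sketched as follows. This is an illustration only: it assumes the predicted and native structures have already been superimposed and that their atoms are listed in the same order, and it omits the optimal-superposition step (for example, the Kabsch algorithm) that structure-comparison tools normally perform first. The caption does not state which atoms or which tool the authors used, and the function name `rmsd` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(predicted: Sequence[Point], native: Sequence[Point]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if not predicted or len(predicted) != len(native):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xp - xn) ** 2 + (yp - yn) ** 2 + (zp - zn) ** 2
        for (xp, yp, zp), (xn, yn, zn) in zip(predicted, native)
    )
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Toy coordinates: every atom is displaced by 1 unit along z.
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(predicted, native))
```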
