Review
Brief Bioinform. 2023 May 19;24(3):bbad153.
doi: 10.1093/bib/bbad153.

Machine learning for RNA 2D structure prediction benchmarked on experimental data

Marek Justyna et al. Brief Bioinform.

Abstract

Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.

Keywords: RNA 2D structure prediction; algorithm benchmarking; deep learning; machine learning.

Figures

Figure 1
Interaction Network Fidelity (INF) computed for canonical base pairs, in Experiment I (predicting representatives of equivalence classes). Colors refer to groups of algorithms: blue – deep learning (DL), orange – shallow learning (SL) and green – non-ML algorithms.
Figure 2
Interaction Network Fidelity (INF) computed for canonical base pairs, in Experiment II (predicting selected Rfam sequences). Colors refer to groups of algorithms: blue – deep learning (DL), orange – shallow learning (SL) and green – non-ML algorithms.
Figure 3
Interaction Network Fidelity (INF) computed for canonical base pairs, in Experiment III (predicting new Rfam families). Colors refer to groups of algorithms: blue – deep learning (DL), orange – shallow learning (SL) and green – non-ML algorithms.
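
The INF score named in the captions above is commonly defined as the geometric mean of positive predictive value (PPV) and sensitivity over base pairs. The following is a minimal sketch of that calculation, not the benchmarking code used in the review. It assumes both structures are given in dot-bracket notation using only round brackets, so pseudoknot brackets and the canonical versus non-canonical distinction are not handled. The function names are hypothetical, and returning 0.0 when no pairs are shared is a convention chosen here.

```python
import math


def pairs_from_dot_bracket(structure):
    """Return the set of base pairs (i, j), i < j, encoded by '(' and ')'.

    Any other character (for example '.') is treated as unpaired.
    """
    stack = []
    pairs = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def inf_score(reference, predicted):
    """Geometric mean of PPV and sensitivity over base pairs.

    Assumption of this sketch: the score is 0.0 whenever the two
    structures share no base pair, including when either is empty.
    """
    ref = pairs_from_dot_bracket(reference)
    pred = pairs_from_dot_bracket(predicted)
    true_positives = len(ref & pred)
    if true_positives == 0:
        return 0.0
    ppv = true_positives / len(pred)          # fraction of predicted pairs that are correct
    sensitivity = true_positives / len(ref)   # fraction of reference pairs recovered
    return math.sqrt(ppv * sensitivity)


if __name__ == "__main__":
    # One of the two reference pairs is predicted: PPV = 1, sensitivity = 0.5
    print(round(inf_score("((..))", "(....)"), 4))
```

A prediction identical to the reference scores 1.0, and a prediction that shares no pairs with it scores 0.0, so higher values in the figures indicate closer agreement with the experimental structure.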
