Review
Front Mol Biosci. 2022 May 23:9:869601.
doi: 10.3389/fmolb.2022.869601. eCollection 2022.

Deep Learning in RNA Structure Studies

Haopeng Yu et al. Front Mol Biosci.

Abstract

Deep learning, or artificial neural networks, is a type of machine learning algorithm that can decipher underlying relationships from large volumes of data and has been successfully applied to solve structural biology questions, such as RNA structure. RNA can fold into complex RNA structures by forming hydrogen bonds, thereby playing an essential role in biological processes. While experimental effort has enabled resolving RNA structure at the genome-wide scale, deep learning has been more recently introduced for studying RNA structure and its functionality. Here, we discuss successful applications of deep learning to solve RNA problems, including predictions of RNA structures, non-canonical G-quadruplex, RNA-protein interactions and RNA switches. Following these cases, we give a general guide to deep learning for solving RNA structure problems.

Keywords: RNA G-quadruplex; RNA secondary structure; RNA structure prediction; RNA tertiary structure; RNA-protein interaction; deep learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Schematic overview of a deep learning workflow. (A) Data processing. Supervised learning requires explicit labelling of the data: class labels for classification problems and numeric values for regression problems. (B) Model design. The multilayer perceptron (MLP), convolutional neural network (CNN) and recurrent neural network (RNN) are the three main families of deep learning architecture. Typically, deep learning models combine different architectures depending on the structure of the data. (C) Model training. The full dataset is first divided into a training set, a validation set and a test set. The input data is then passed through the model to obtain predicted values. A loss function measures the difference between the predicted and true values, and the model weights are updated to reduce it. (D) Model interpretation. Feature importance can be estimated by in silico mutation. For a CNN model, features can also be examined by extracting the weight matrices of the filters.
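
The steps named in panels (A), (C) and (D) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 80/10/10 split, the mean squared error loss and the `toy_model` scorer (which simply returns the fraction of G in a sequence) are all assumptions chosen for illustration, and a real study would use a trained network in place of `toy_model`.

```python
import random

BASES = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element vectors (order A, C, G, U)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def split_dataset(items, frac_train=0.8, frac_val=0.1, seed=0):
    """Shuffle the data and split it into training, validation and test sets.

    The 80/10/10 proportions are an assumption, not taken from the review.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * frac_train)
    n_val = int(len(items) * frac_val)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

def mse_loss(predicted, true):
    """Mean squared error between predicted and true values (one possible loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, true)) / len(true)

def toy_model(seq):
    """Hypothetical stand-in for a trained model: fraction of G in the sequence."""
    return seq.count("G") / len(seq)

def in_silico_mutagenesis(seq, model):
    """Return the change in model output for every single-base substitution.

    Keys are (position, new_base); values are mutant score minus reference score.
    """
    reference = model(seq)
    effects = {}
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                mutant = seq[:i] + base + seq[i + 1:]
                effects[(i, base)] = model(mutant) - reference
    return effects

if __name__ == "__main__":
    train, val, test = split_dataset(["GGAC", "AUUG", "CCGA", "UGGU", "ACGU",
                                      "GGGG", "AAAA", "CUCU", "GAGA", "UUCC"])
    print(len(train), len(val), len(test))
    print(in_silico_mutagenesis("GA", toy_model))
```

Positions whose substitutions cause the largest change in output are the ones the model relies on most, which is the idea behind estimating feature importance by in silico mutation.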
