Artificial intelligence methods enhance the discovery of RNA interactions
- PMID: 36275611
- PMCID: PMC9585310
- DOI: 10.3389/fmolb.2022.1000205
Abstract
Understanding how RNAs interact with proteins, other RNAs, or other molecules remains a major challenge in biology, given the importance of these complexes in both normal and pathological cellular processes. Because experimental datasets are becoming available for hundreds of functional interactions between RNAs and other biomolecules, several machine learning and deep learning algorithms have been proposed for predicting RNA-RNA or RNA-protein interactions. However, most of these approaches were evaluated on a single dataset, which makes performance comparisons difficult. In this review, we summarize recent computational methods developed in this broad research area, highlighting the feature encoding and machine learning strategies they adopt. Because dataset size and quality strongly affect performance, we examine the characteristics of these datasets. We also discuss multiple approaches to generating datasets of negative examples for training. Finally, we describe the best-performing methods for predicting interactions between proteins and specific classes of RNA molecules, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), as well as methods for predicting RNA-RNA or RNA-RBP interactions independently of the RNA type.
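The abstract does not say which negative-example strategies the review covers. As a rough illustration only, the sketch below shows one strategy often used in this field: randomly pairing RNAs and proteins that are not observed together in the positive set. The function name `sample_negative_pairs`, its parameters, and the identifiers in the usage note are hypothetical and are not taken from the paper.

```python
import random
from itertools import product


def sample_negative_pairs(positive_pairs, n_negatives, seed=0):
    """Sample putative non-interacting (RNA, protein) pairs.

    Assumption: any RNA-protein combination absent from the positive set
    is treated as a negative. In practice some of these pairs may be
    undiscovered interactions, so the labels are noisy.
    """
    positives = set(positive_pairs)
    # Sort the identifiers so the result is reproducible for a given seed.
    rnas = sorted({rna for rna, _ in positives})
    proteins = sorted({protein for _, protein in positives})

    # Every combination that was not observed as a positive is a candidate.
    candidates = [
        pair for pair in product(rnas, proteins) if pair not in positives
    ]
    if n_negatives > len(candidates):
        raise ValueError("not enough unobserved pairs to sample from")

    rng = random.Random(seed)
    # Sampling without replacement guarantees distinct negative pairs.
    return rng.sample(candidates, n_negatives)
```

For example, with the made-up positives `[("r1", "p1"), ("r2", "p2"), ("r3", "p3")]`, calling `sample_negative_pairs(positives, 3)` returns three distinct pairs drawn from the six unobserved combinations. Sampling as many negatives as positives keeps the training set balanced, which is a design choice assumed here rather than a recommendation from the review.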
Keywords: RNA; RNA interaction predictors; RNA secondary structure; RNA sequence; deep learning; embedding; machine learning; natural language processing.
Copyright © 2022 Pepe, Appierdo, Carrino, Ballesio, Helmer-Citterich and Gherardini.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
