Inferring RNA-binding protein target preferences using adversarial domain adaptation
- PMID: 35202389
- PMCID: PMC8870515
- DOI: 10.1371/journal.pcbi.1009863
Abstract
Precise identification of the target sites of RNA-binding proteins (RBPs) is important for understanding their biochemical and cellular functions. A large amount of experimental data is generated by in vivo and in vitro approaches. The binding preferences determined from these platforms share similar patterns, but there are discernible differences between the datasets, and computational methods trained on one dataset do not always work well on another. To address this problem, which resembles the classic "domain shift" in deep learning, we adopted the adversarial domain adaptation (ADDA) technique and developed a framework (RBP-ADDA) that can extract RBP binding preferences from an integration of in vivo and in vitro datasets. Compared with conventional methods, ADDA has the advantage of working with two input datasets: it trains the initial neural network for each dataset individually, projects the two datasets onto a shared feature space, and uses an adversarial framework to derive a network that achieves optimal discriminative predictive power. In the first step, for each RBP, we use only the in vitro data to pre-train a source network and a task predictor. Next, for the same RBP, we initialize the target network from the source network and use adversarial domain adaptation to update the target network using both in vitro and in vivo data. These two steps leverage the in vitro data to improve prediction on in vivo data, which are typically more challenging to model because of their lower signal-to-noise ratio. Finally, to take further advantage of the fused source and target data, we fine-tune the task predictor using both datasets. We showed that RBP-ADDA achieved better performance in modeling in vivo RBP binding data than other existing methods, as judged by Pearson correlations. It also improved predictive performance on in vitro datasets.
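The three training stages described above can be illustrated with a toy NumPy sketch. It is not the RBP-ADDA implementation: the single-layer tanh encoders, the logistic discriminator, the synthetic "in vitro" and "in vivo" arrays, and all names (`pretrain`, `adapt`, `finetune`, the learning rates and step counts) are assumptions made for illustration only. Real inputs would be encoded RNA sequences and the networks would be deeper.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(W, X):
    """Toy feature extractor: one linear layer followed by tanh."""
    return np.tanh(X @ W)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain(Xs, ys, hidden=8, lr=0.05, steps=500):
    """Stage 1: fit source encoder W and linear predictor w on source data (MSE)."""
    W = rng.normal(0, 0.1, (Xs.shape[1], hidden))
    w = np.zeros(hidden)
    for _ in range(steps):
        H = encode(W, Xs)
        err = H @ w - ys
        grad_w = H.T @ err / len(ys)
        grad_Z = np.outer(err, w) * (1 - H**2)  # gradient at the pre-activation
        w -= lr * grad_w
        W -= lr * Xs.T @ grad_Z / len(ys)
    return W, w

def adapt(Ws, Xs, Xt, lr=0.05, steps=200):
    """Stage 2: start the target encoder from the source encoder, then
    alternate discriminator and target-encoder updates."""
    Wt = Ws.copy()
    v, b = np.zeros(Ws.shape[1]), 0.0
    Hs = encode(Ws, Xs)  # the source encoder stays fixed
    for _ in range(steps):
        Ht = encode(Wt, Xt)
        # Discriminator step: source features labelled 1, target features 0.
        ps, pt = sigmoid(Hs @ v + b), sigmoid(Ht @ v + b)
        v -= lr * (Hs.T @ (ps - 1) / len(Hs) + Ht.T @ pt / len(Ht))
        b -= lr * (np.mean(ps - 1) + np.mean(pt))
        # Encoder step: inverted label, so target features look like source.
        pt = sigmoid(Ht @ v + b)
        grad_Z = np.outer(pt - 1, v) * (1 - Ht**2)
        Wt -= lr * Xt.T @ grad_Z / len(Xt)
    return Wt

def finetune(w, H, y, lr=0.05, steps=300):
    """Stage 3: fine-tune only the predictor on pooled source and target features."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * H.T @ (H @ w - y) / len(y)
    return w

# Synthetic stand-ins (assumptions): clean "in vitro" data, shifted and noisier "in vivo" data.
d = 6
a = rng.normal(size=d)
Xs = rng.normal(size=(200, d))
ys = np.tanh(Xs @ a)
Xt = rng.normal(size=(150, d)) + 0.5
yt = np.tanh(Xt @ a) + rng.normal(0, 0.3, 150)

Ws, w0 = pretrain(Xs, ys)
Wt = adapt(Ws, Xs, Xt)
H_all = np.vstack([encode(Ws, Xs), encode(Wt, Xt)])
y_all = np.concatenate([ys, yt])
w1 = finetune(w0, H_all, y_all)

def joint_mse(w):
    return float(np.mean((H_all @ w - y_all) ** 2))

# Pearson correlation on the target domain, the metric named in the abstract.
pearson_target = float(np.corrcoef(encode(Wt, Xt) @ w1, yt)[0, 1])
```

In this sketch only the target encoder moves during adaptation while the source encoder stays frozen, which mirrors the general ADDA recipe; whether RBP-ADDA freezes the same components is not stated in the abstract.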
We further applied data augmentation to RBPs with limited in vivo data to expand the input data and showed that this can improve prediction performance. Lastly, we explored the interpretability of RBP-ADDA's predictions: we quantified the contribution of the input features using Integrated Gradients and identified nucleotide positions that are important for RBP recognition.
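As a rough illustration of how Integrated Gradients assigns importance to nucleotide positions, the sketch below applies it to a toy logistic model over a one-hot encoded RNA sequence. The model, its random weights `W`, the example sequence, the all-zeros baseline, and the helper names are hypothetical and are not taken from RBP-ADDA.

```python
import numpy as np

NUCS = "ACGU"

def one_hot(seq):
    """Encode an RNA string as a (length, 4) one-hot matrix."""
    x = np.zeros((len(seq), 4))
    x[np.arange(len(seq)), [NUCS.index(c) for c in seq]] = 1.0
    return x

def model(x, W):
    """Toy binding score: sigmoid of a position-specific weight sum."""
    return 1.0 / (1.0 + np.exp(-np.sum(W * x)))

def model_grad(x, W):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, W)
    return p * (1.0 - p) * W

def integrated_gradients(x, W, baseline=None, steps=200):
    """Average the gradient along the straight path from baseline to x
    (midpoint rule), then scale by (x - baseline)."""
    if baseline is None:
        baseline = np.zeros_like(x)
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [model_grad(baseline + a * (x - baseline), W) for a in alphas]
    return (x - baseline) * np.mean(grads, axis=0)

rng = np.random.default_rng(1)
seq = "GCAUGUAC"                        # hypothetical input sequence
W = rng.normal(size=(len(seq), 4))      # hypothetical model weights
x = one_hot(seq)

attr = integrated_gradients(x, W)
position_scores = attr.sum(axis=1)      # one importance score per nucleotide position
```

A useful sanity check is the completeness property: the attributions should sum to approximately `model(x, W) - model(baseline, W)`.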
Conflict of interest statement
The authors have declared that no competing interests exist.
Similar articles
- Deep neural networks for interpreting RNA-binding protein target preferences. Genome Res. 2020 Feb;30(2):214-226. doi: 10.1101/gr.247494.118. Epub 2020 Jan 28. PMID: 31992613. Free PMC article.
- Deep neural networks for inferring binding sites of RNA-binding proteins by using distributed representations of RNA primary sequence and secondary structure. BMC Genomics. 2020 Dec 17;21(Suppl 13):866. doi: 10.1186/s12864-020-07239-w. PMID: 33334313. Free PMC article.
- Prediction of the RBP binding sites on lncRNAs using the high-order nucleotide encoding convolutional neural network. Anal Biochem. 2019 Oct 15;583:113364. doi: 10.1016/j.ab.2019.113364. Epub 2019 Jul 16. PMID: 31323206.
- Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genomics. 2018 Jul 3;19(1):511. doi: 10.1186/s12864-018-4889-1. PMID: 29970003. Free PMC article.
- Finding the target sites of RNA-binding proteins. Wiley Interdiscip Rev RNA. 2014 Jan-Feb;5(1):111-30. doi: 10.1002/wrna.1201. Epub 2013 Nov 11. PMID: 24217996. Free PMC article. Review.
Cited by
- Emerging RNA-centric technologies to probe RNA-protein interactions: importance in decoding the life cycle of positive sense single strand RNA viruses and antiviral discovery. Front Cell Infect Microbiol. 2025 May 21;15:1580337. doi: 10.3389/fcimb.2025.1580337. eCollection 2025. PMID: 40584171. Free PMC article. Review.
- A systematic benchmark of machine learning methods for protein-RNA interaction prediction. Brief Bioinform. 2023 Sep 20;24(5):bbad307. doi: 10.1093/bib/bbad307. PMID: 37635383. Free PMC article.