Review
OMICS. 2018 Oct;22(10):630-636.
doi: 10.1089/omi.2018.0097. Epub 2018 Aug 20.

Rise of Deep Learning for Genomic, Proteomic, and Metabolomic Data Integration in Precision Medicine

Dmitry Grapov et al. OMICS. 2018 Oct.

Abstract

Machine learning (ML) is being ubiquitously incorporated into everyday products such as Internet search, email spam filters, product recommendations, image classification, and speech recognition. New approaches for highly integrated manufacturing and automation, such as Industry 4.0 and the Internet of Things, are also converging with ML methodologies. Many approaches incorporate complex artificial neural network architectures and are collectively referred to as deep learning (DL) applications. These methods have been shown capable of representing and learning predictable relationships in many diverse forms of data and hold promise for transforming the future of omics research and applications in precision medicine. Omics and electronic health record data pose considerable challenges for DL because of factors such as low signal-to-noise ratios, analytical variance, and complex data integration requirements. However, DL models have already been shown capable of improving both the ease of data encoding and predictive model performance over alternative approaches. It may not be surprising that concepts encountered in DL share similarities with those observed in biological message relay systems such as gene, protein, and metabolite networks. This expert review examines the challenges and opportunities for DL at a systems and biological scale for a precision medicine readership.

Keywords: artificial intelligence; biomarkers; deep learning; machine learning; multiomics data integration; precision medicine.

Conflict of interest statement

D.G. is the Director of Data Science and Bioinformatics at CDS—Creative Data Solutions LLC, www.createdatasol.com.

Figures

FIG. 1.
Multiomics data integration utilizes empirical, functional, and other techniques to combine information from multiple omics domains. This systems approach enables robust characterization of biochemical signatures reflective of organismal phenotypes.
FIG. 2.
DL architectures may provide unique opportunities to encode locally optimal predictors in a variety of organisms (cellular, mouse, primate, and human) and then integrate their representations of omics layers. Through transfer learning, researchers may leverage larger expert-derived models to improve DL performance for their smaller data sets.
FIG. 3.
DL model architectures and training techniques share many similarities with biological message passing systems. DL models contain a minimum of three layers: input, hidden, and output. This could mimic the representation of relationships between gene transcription, protein expression, and metabolite concentrations, but can also extend to other omics layers. Interesting parallels between computational and biological optimizations, such as backward propagation in DL and signal inhibition in omics, have also emerged.
FIG. 4.
Personalized medicine is a quickly growing area of research that requires complex data encoding and integration tasks, which are well suited for DL.
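
The FIG. 3 caption describes a DL model with at least three layers (input, hidden, and output) trained by backward propagation. The following is a minimal sketch of that idea in plain Python, not the authors' method: the three input features stand in for hypothetical gene, protein, and metabolite measurements, and the weights, learning rate, layer sizes, and function names (`forward`, `train_step`) are illustrative assumptions that do not come from the review.

```python
import math


def sigmoid(z):
    """Logistic activation used for both the hidden and output units."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Input layer -> hidden layer -> single output unit."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y


def train_step(x, target, W1, b1, W2, b2, lr=0.5):
    """One backward-propagation update for the loss 0.5 * (y - target)**2."""
    h, y = forward(x, W1, b1, W2, b2)
    # Error signal at the output unit (chain rule through the sigmoid).
    delta_out = (y - target) * y * (1.0 - y)
    # Error signal passed back to each hidden unit, using the old W2.
    delta_hid = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(W2, h)]
    # Gradient-descent updates for both layers.
    W2 = [w - lr * delta_out * hi for w, hi in zip(W2, h)]
    b2 = b2 - lr * delta_out
    W1 = [[w - lr * d * xi for w, xi in zip(row, x)]
          for row, d in zip(W1, delta_hid)]
    b1 = [b - lr * d for b, d in zip(b1, delta_hid)]
    return W1, b1, W2, b2, 0.5 * (y - target) ** 2


if __name__ == "__main__":
    # Hypothetical, made-up values: one sample with three scaled omics features.
    x, target = [0.5, 1.0, -0.5], 1.0
    W1, b1 = [[0.1, -0.2, 0.3], [0.2, 0.1, -0.1]], [0.0, 0.0]
    W2, b2 = [0.3, -0.3], 0.0
    for step in range(200):
        W1, b1, W2, b2, loss = train_step(x, target, W1, b1, W2, b2)
    print("final loss:", loss)
```

Real omics models would use many more features, samples, and layers, typically through a dedicated DL library; the sketch only shows how an error at the output is propagated backward to adjust the earlier layer.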
