Review. Front Aging Neurosci. 2022 Nov 18;14:1027224. doi: 10.3389/fnagi.2022.1027224. eCollection 2022.

Deep learning approaches for noncoding variant prioritization in neurodegenerative diseases

Alexander Y Lan et al. Front Aging Neurosci. .

Abstract

Determining how noncoding genetic variants contribute to neurodegenerative dementias is fundamental to understanding disease pathogenesis, improving patient prognostication, and developing new clinical treatments. Next-generation sequencing technologies have produced vast amounts of genomic data on cell type-specific transcription factor binding, gene expression, and three-dimensional chromatin interactions, with the promise of providing key insights into the biological mechanisms underlying disease. However, these data are highly complex, making them challenging for researchers to interpret, assimilate, and dissect. To this end, deep learning has emerged as a powerful tool for genome analysis that can capture the intricate patterns and dependencies within these large datasets. In this review, we organize and discuss the many unique model architectures, development philosophies, and interpretation methods that have emerged in the last few years with a focus on using deep learning to predict the impact of genetic variants on disease pathogenesis. We highlight both broadly applicable genomic deep learning methods that can be fine-tuned to disease-specific contexts as well as existing neurodegenerative disease research, with an emphasis on Alzheimer's-specific literature. We conclude with an overview of the future of the field at the intersection of neurodegeneration, genomics, and deep learning.

Keywords: gene regulation; genomics; machine learning; neurodegeneration; noncoding genetic variation.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Model and layer architectures. (A) Diagram of the fully-connected architecture present in ANNs. Every node is connected to all nodes of the previous layer and all nodes of the following layer. (B) Diagram of a single convolutional filter within a single convolutional layer. Every element in the shaded input matrix is multiplied by the corresponding weight in the convolutional filter, and the products are combined into one output value in the shaded output square. (C) Depiction of the recurrent neural network architecture, where the primary ANN block takes the current input along with memory information stored over short or long distances. (D) Flowchart of the transformer multi-head attention layer, which first takes a list of inputs and passes them through three ANN blocks. Together, the query and key matrix outputs form attention filters, which, when multiplied with the outputs of the value matrix, generate a list of filtered output matrices. Each attention filter may highlight a different part of the input. The final output ANN reduces the number of dimensions back to the original input size.
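The attention computation in Figure 1D can be sketched numerically. The following is a minimal single-head example in NumPy, not taken from any model in the review; the weight matrices W_q, W_k, and W_v (the three "ANN blocks") are random illustrative values, and the softmax of the query-key similarities plays the role of the attention filter that weights the value outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    """Single attention head: inputs X of shape (n, d) -> outputs (n, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v           # query, key, value projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # scaled pairwise similarities
    weights = softmax(scores, axis=-1)            # attention filter; rows sum to 1
    return weights @ V                            # weighted mix of value outputs

rng = np.random.default_rng(0)
n, d = 5, 8                                       # 5 input positions, 8 features
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
out = attention(X, W_q, W_k, W_v)
print(out.shape)                                  # same shape as the input: (5, 8)
```

A real transformer runs several such heads in parallel and concatenates their outputs before the final projection that restores the input dimensionality, as the caption describes.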
Figure 2
Sample genomics DL model with convolutional, attention, and intermediate layers. This model representation captures the basic architecture used by most genomics DL models. The input DNA sequence is first one-hot encoded into the 4-by-N matrix shown on the left; then a convolutional layer extracts certain patterns by traversing the input sequence with multiple filters, whose weights are learned during training. Both standard and dilated convolutional layers are shown. Along with more convolutional or attention layers, model designers often use intermediate layers to simplify computation, consolidate data representations, or learn more patterns. Examples of intermediate layers include fully-connected, RNN, cropping, flatten, or pooling layers. Lastly, the model outputs either a predicted genomic track, as shown, or a single label representing the amount of enriched signal for the entire sequence.
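The input step in Figure 2 is straightforward to sketch. Below is a hypothetical NumPy example, assuming the common A/C/G/T row ordering: a DNA string is one-hot encoded into the 4-by-N matrix, and a single convolutional filter (here a toy identity filter that acts as an "ACGT" motif detector, not a trained weight matrix) is slid across it to produce a position-wise output track.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a 4-by-N matrix (rows ordered A, C, G, T)."""
    mat = np.zeros((4, len(seq)))
    for i, base in enumerate(seq):
        mat[BASES.index(base), i] = 1.0
    return mat

def conv1d(x, filt):
    """Valid cross-correlation of a (4, N) input with a single (4, k) filter."""
    k = filt.shape[1]
    return np.array([np.sum(x[:, i:i + k] * filt)
                     for i in range(x.shape[1] - k + 1)])

x = one_hot("ACGTACGT")      # shape (4, 8)
filt = np.eye(4)             # toy 4-by-4 filter: fires on the motif "ACGT"
track = conv1d(x, filt)
print(track)                 # [4. 0. 0. 0. 4.] -- peaks where "ACGT" occurs
```

Real models learn many such filters jointly and stack further layers on top, but the encoding and sliding-window operation are exactly this.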
