Scoping review and classification of deep learning in medical genetics

Suzanna E Ledgister Hanchard¹, Michelle C Dwyer¹, Simon Liu¹, Ping Hu¹, Cedrik Tekendo-Ngongang¹, Rebekah L Waikel¹, Dat Duong¹, Benjamin D Solomon²

Affiliations

¹ Medical Genomics Unit, National Human Genome Research Institute, Bethesda, MD.
² Medical Genomics Unit, National Human Genome Research Institute, Bethesda, MD. Electronic address: solomonb@mail.nih.gov.

PMID: 35612590
PMCID: PMC11056027
DOI: 10.1016/j.gim.2022.04.025

Suzanna E Ledgister Hanchard et al. Genet Med. 2022 Aug;24(8):1593-1603. doi: 10.1016/j.gim.2022.04.025. Epub 2022 May 25.

Abstract

Deep learning (DL) is applied in many biomedical areas. We performed a scoping review on DL in medical genetics. We first assessed 14,002 articles, of which 133 involved DL in medical genetics. DL in medical genetics increased rapidly during the studied period. In medical genetics, DL has largely been applied to small data sets of affected individuals (mean = 95, median = 29) with genetic conditions (71 different genetic conditions were studied; 24 articles studied multiple conditions). A variety of data types have been used in medical genetics, including radiologic (20%), ophthalmologic (14%), microscopy (8%), and text-based data (4%); the most common data type was patient facial photographs (46%). DL authors and research subjects overrepresent certain geographic areas (United States, Asia, and Europe). Convolutional neural networks (89%) were the most common method. Results were compared with human performance in 31% of studies. In total, 51% of articles provided data access; 16% released source code. To aid data curation, we evaluated DL, random forest, and rule-based classifiers for categorizing article abstracts. To further explore DL in genomics, we conducted an additional analysis, the results of which highlight future opportunities for DL in medical genetics. Finally, we expect DL applications to increase in the future.

Keywords: Artificial intelligence; Deep learning; Machine learning; Medical genetics; Medical genomics.

Published by Elsevier Inc.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 1. PRISMA schema for data collection and categorization.**
Although we performed a scoping review, this schema was adapted from the one used for systematic reviews and is used with appropriate permission and citation as described in the guidelines. Sources used include Clinical Genomic Database (CGD), Face2Gene, OMIM, and PubMed. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

**Figure 2. Articles per year and characteristics of studied individuals.**
A. Number of articles per year binned as category 1 (articles on deep learning [DL] applied to genetic conditions). Articles from 2021 include observed articles as well as projected articles; the latter were calculated on the basis of the observed trend during the depicted time period (January 2015-June 2021). B. Distribution of genetic conditions studied using DL. C. Number of individuals with the studied genetic conditions included in each study. Further details are available in Supplemental Table 3.

**Figure 3. Characteristics of methods used.**
A. Types of clinical input data analyzed via deep learning (DL). B. Types of DL methods used in each article. C. Categorization of the primary use of DL in each article. Further details are available in Supplemental Table 3. BERT, bidirectional encoder representations from transformers; CNN, convolutional neural network; ECG, electrocardiogram; EEG, electroencephalogram; RNN, recurrent neural network.

**Figure 4. Geographic distribution of articles.**
A. Location of the corresponding author(s) for each of the 133 articles. B. Location of study populations for articles with available data.

Grants and funding

ZIA HG200405/ImNIH/Intramural NIH HHS/United States
