Review. Hum Mol Genet. 2018 May 1;27(R1):R29-R34.
doi: 10.1093/hmg/ddy088.

Biomedical informatics and machine learning for clinical genomics

James A Diao et al. Hum Mol Genet.

Abstract

While tens of thousands of pathogenic variants are used to inform the many clinical applications of genomics, there remains limited information on quantitative disease risk for the majority of variants used in clinical practice. At the same time, rising demand for genetic counselling has prompted a growing need for computational approaches that can help interpret genetic variation. Such tasks include predicting variant pathogenicity and identifying variants that are too common to be penetrant. To address these challenges, researchers are increasingly turning to integrative informatics approaches. These approaches often leverage vast sources of data, including electronic health records and population-level allele frequency databases (e.g. gnomAD), as well as machine learning techniques such as support vector machines and deep learning. In this review, we highlight recent informatics and machine learning approaches that are improving our understanding of pathogenic variation and discuss obstacles that may limit their emerging role in clinical genomics.
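One way to make the idea of a variant being "too common to be penetrant" concrete is a back-of-the-envelope frequency ceiling. The sketch below is an illustration under stated assumptions, not a method taken from this review: it assumes a dominant disorder, a rare variant whose carriers are almost all heterozygotes (carrier frequency of roughly twice the allele frequency), and user-supplied estimates of disease prevalence, penetrance and the largest share of cases any single variant could explain. The function names and the example numbers are hypothetical.

```python
def max_credible_allele_frequency(prevalence, penetrance, max_allelic_contribution=1.0):
    """Upper bound on allele frequency for a dominant, causal variant.

    Assumption: carriers are heterozygous, so carrier frequency ~ 2 * AF.
    Cases caused by the variant (2 * AF * penetrance) cannot exceed the
    share of all cases one variant may explain
    (prevalence * max_allelic_contribution).
    """
    if not 0.0 < penetrance <= 1.0:
        raise ValueError("penetrance must be in (0, 1]")
    if not 0.0 <= max_allelic_contribution <= 1.0:
        raise ValueError("max_allelic_contribution must be in [0, 1]")
    return prevalence * max_allelic_contribution / (2.0 * penetrance)


def too_common_to_be_penetrant(observed_af, prevalence, penetrance,
                               max_allelic_contribution=1.0):
    """True if the observed population frequency exceeds the ceiling."""
    ceiling = max_credible_allele_frequency(
        prevalence, penetrance, max_allelic_contribution
    )
    return observed_af > ceiling


# Hypothetical inputs: prevalence 1 in 500, penetrance 0.5, and no single
# variant explaining more than 2% of cases.
ceiling = max_credible_allele_frequency(1 / 500, 0.5, 0.02)
flag_common = too_common_to_be_penetrant(1e-3, 1 / 500, 0.5, 0.02)
flag_rare = too_common_to_be_penetrant(1e-6, 1 / 500, 0.5, 0.02)
```

In practice the observed frequency would come from a population database such as gnomAD, and a careful analysis would also account for sampling uncertainty in that frequency, which this sketch ignores.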

Figures

Figure 1.
Predicting variant pathogenicity status using a neural network. Schematic representation of a neural network that predicts the pathogenicity status (pathogenic versus benign) of a genetic variant using a large number of input features, including sequence conservation, regulatory information and protein-level annotations. Feature scores are passed serially through successive interconnected layers and trained using a large set of labeled variants. Two hidden layers are shown in the schematic diagram above, but modern networks often consist of many more layers.
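The forward pass described in the caption can be sketched in a few lines of plain Python. This is a minimal illustration, not the architecture of any specific published predictor: the three input features, the layer widths, the ReLU activation, the sigmoid output and the random (untrained) weights are all assumptions chosen for brevity. A real model would learn its weights from a large set of labeled variants.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases (untrained, for illustration)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def predict_pathogenicity(features, layers):
    """Pass feature scores serially through the hidden layers, then map the
    single output unit to a value in (0, 1), read as P(pathogenic)."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    out_weights, out_biases = layers[-1]
    (logit,) = dense(hidden, out_weights, out_biases)
    return sigmoid(logit)


rng = random.Random(0)
# Two hidden layers of four units each, as in the schematic, plus one output unit.
layers = [init_layer(3, 4, rng), init_layer(4, 4, rng), init_layer(4, 1, rng)]

# Hypothetical feature scores: conservation, regulatory, protein-level.
variant_features = [0.9, 0.1, 0.7]
score = predict_pathogenicity(variant_features, layers)
```

Because the weights here are random, `score` carries no biological meaning; the point is only the data flow from input features through two hidden layers to a single pathogenic-versus-benign output.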
