Review
Hum Mol Genet. 2018 May 1;27(R1):R14-R21.
doi: 10.1093/hmg/ddy081.

Electronic health records: the next wave of complex disease genetics

Brooke N Wolford et al. Hum Mol Genet.

Abstract

The combination of electronic health records (EHRs) with genetic data has ushered in the next wave of complex disease genetics. Population-based biobanks and other large cohorts provide sufficient sample sizes to identify novel genetic associations across the hundreds to thousands of phenotypes gleaned from EHRs. In this review, we summarize the current state of these EHR-linked biobanks, explore ongoing methods development in the field and highlight recent discoveries of genetic associations. We enumerate the many existing biobanks with EHRs linked to genetic data, many of which are available to researchers via application and contain sample sizes >50 000. We also discuss the computational and statistical considerations for analysis of such large datasets including mixed models, phenotype curation and cloud computing. Finally, we demonstrate how genome-wide association studies and phenome-wide association studies have identified novel genetic findings for complex diseases, specifically cardiometabolic traits. As more researchers employ innovative hypotheses and analysis approaches to study EHR-linked biobanks, we anticipate a richer understanding of the genetic etiology of complex diseases.

Figures

Figure 1.
Locus zoom plot of the lead variant (rs116843064) in ANGPTL4 from the PheGWAS available at University of Michigan’s PheWeb. The variant is associated with coronary atherosclerosis (P-value < 1.6e−7) in 20 023 cases and 377 103 controls in UKBB. As expected, the variant is also associated at phenome-wide significance (P-value < 5e−5) with other phenotypes, including hypercholesterolemia and ischemic heart disease. Notably, it is also associated with ankylosing spondylitis, a form of arthritis affecting the spine and large joints. Although ankylosing spondylitis appears pathologically distinct from coronary artery disease (CAD), a link between the two has been reported previously (51). The constellation of associations across the circulatory, metabolic and musculoskeletal systems provides evidence for pleiotropy or shared pathways of disease pathogenesis.

References

    1. Kohane I.S. (2011) Using electronic health records to drive discovery in disease genomics. Nat. Rev. Genet., 12, 417.
    2. Denny J.C., Ritchie M.D., Basford M.A., Pulley J.M., Bastarache L., Brown-Gentry K., Wang D., Masys D.R., Roden D.M., Crawford D.C. (2010) PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene–disease associations. Bioinformatics, 26, 1205–1210.
    3. Zhou W., Nielsen J.B., Fritsche L.G., Dey R., Elvestad M.B., Wolford B.N., LeFaive J., VandeHaar P., Gifford A., Bastarache L.A. et al. (2017) Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. bioRxiv, 212357. https://doi.org/10.1101/212357
    4. Gulcher J., Stefansson K. (1999) An Icelandic saga on a centralized healthcare database and democratic decision making. Nat. Biotechnol., 17, 620.
    5. Pulley J., Clayton E., Bernard G.R., Roden D.M., Masys D.R. (2010) Principles of human subjects protections applied in an opt-out, de-identified biobank. Clin. Transl. Sci., 3, 42–48.
