Genome Med. 2021 Jan 13;13(1):6.
doi: 10.1186/s13073-020-00820-8.

Clinical laboratory test-wide association scan of polygenic scores identifies biomarkers of complex disease

Jessica K Dennis et al. Genome Med. 2021.

Abstract

Background: Clinical laboratory (lab) tests are used in clinical practice to diagnose, treat, and monitor disease conditions. Test results are stored in electronic health records (EHRs), and a growing number of EHRs are linked to patient DNA, offering unprecedented opportunities to query relationships between genetic risk for complex disease and quantitative physiological measurements collected on large populations.

Methods: A total of 3075 quantitative lab tests were extracted from Vanderbilt University Medical Center's (VUMC) EHR system and cleaned for population-level analysis according to our QualityLab protocol. Lab values extracted from BioVU were compared with previous population studies using heritability and genetic correlation analyses. We then tested the hypothesis that polygenic risk scores for biomarkers and complex disease are associated with biomarkers of disease extracted from the EHR. In proof-of-concept analyses, we focused on lipids and coronary artery disease (CAD). We cleaned lab traits extracted from the EHR, performed lab-wide association scans (LabWAS) of the lipid and CAD polygenic risk scores across 315 heritable lab tests, and then replicated the pipeline and analyses in the Massachusetts General Brigham Biobank (MGBB).
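To make the scan concrete, the following is a minimal sketch of the general LabWAS idea, not the authors' LabWAS software: each lab trait is regressed on a polygenic score, and the resulting p values are compared against a Bonferroni threshold. Several things here are assumptions made for illustration: the function names `simple_regression` and `labwas`, the synthetic data, the absence of covariates (a real analysis would typically adjust for factors such as age, sex, and genetic ancestry), the normal approximation to the test statistic, and a family-wise alpha of 0.05. Under that last assumption, a scan of 315 lab tests would use a threshold of 0.05 / 315, or about 1.6e-4.

```python
import math
import random


def simple_regression(x, y):
    """Ordinary least squares of y on x; returns (slope, two-sided p value).

    The p value uses a normal approximation to the t statistic, which is
    reasonable only for large samples (an assumption of this sketch).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = slope / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return slope, p_value


def labwas(pgs, labs, alpha=0.05):
    """Regress every lab trait on the polygenic score and apply Bonferroni."""
    threshold = alpha / len(labs)  # one test per lab trait
    results = []
    for name, values in labs.items():
        slope, p_value = simple_regression(pgs, values)
        results.append({
            "lab": name,
            "beta": slope,
            "p": p_value,
            "direction": "up" if slope > 0 else "down",
            "significant": p_value < threshold,
        })
    results.sort(key=lambda r: r["p"])
    return results, threshold


# Synthetic, purely illustrative data: two labs depend on the score, one does not.
rng = random.Random(0)
pgs = [rng.gauss(0, 1) for _ in range(2000)]
labs = {
    "lab_A": [0.3 * g + rng.gauss(0, 1) for g in pgs],
    "lab_B": [-0.3 * g + rng.gauss(0, 1) for g in pgs],
    "lab_C": [rng.gauss(0, 1) for _ in pgs],
}

results, threshold = labwas(pgs, labs)
for row in results:
    print(row["lab"], row["direction"], f"{row['p']:.2e}", row["significant"])
```

The `direction` field plays the role of the upward and downward triangles in the LabWAS figures: it records whether a higher score goes with higher or lower lab values.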

Results: Heritability estimates of lipid values (after cleaning with QualityLab) were comparable to previous reports and polygenic scores for lipids were strongly associated with their referent lipid in a LabWAS. LabWAS of the polygenic score for CAD recapitulated canonical heart disease biomarker profiles including decreased HDL, increased pre-medication LDL, triglycerides, blood glucose, and glycated hemoglobin (HgbA1C) in European and African descent populations. Notably, many of these associations remained even after adjusting for the presence of cardiovascular disease and were replicated in the MGBB.
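For readers unfamiliar with polygenic scores, the sketch below shows the standard construction in its simplest form: a weighted sum of a person's risk-allele dosages, with weights taken from a prior genome-wide association study. The abstract does not describe how the scores in this study were built, so everything here is hypothetical: the three variant weights, the patient identifiers, and the helper names `polygenic_score` and `standardize`.

```python
import statistics


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 across the cohort."""
    mean = statistics.fmean(scores.values())
    sd = statistics.pstdev(scores.values())
    return {pid: (s - mean) / sd for pid, s in scores.items()}


# Hypothetical per-variant effect sizes (e.g., from GWAS summary statistics).
weights = [0.12, -0.05, 0.30]

# Hypothetical genotypes: risk-allele counts at the same three variants.
cohort = {
    "p1": [0, 1, 2],
    "p2": [2, 2, 0],
    "p3": [1, 0, 1],
}

raw = {pid: polygenic_score(d, weights) for pid, d in cohort.items()}
z = standardize(raw)
for pid in cohort:
    print(pid, round(raw[pid], 3), round(z[pid], 3))
```

Standardizing the score makes effect sizes comparable across scores, since each association is then expressed per standard deviation of genetic risk.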

Conclusions: Polygenic risk scores can be used to identify biomarkers of complex disease in large-scale EHR-based genomic analyses, providing new avenues for discovery of novel biomarkers and deeper understanding of disease trajectories in pre-symptomatic individuals. We present two methods and associated software, QualityLab and LabWAS, to clean and analyze EHR labs at scale and perform a Lab-Wide Association Scan.

Keywords: Biomarkers; Electronic health records; Genetic epidemiology; Population genetics.

Conflict of interest statement

JWS is an unpaid member of the Bipolar/Depression Research Community Advisory Panel of 23andMe, is a member of the Leon Levy Foundation Neuroscience Advisory Board, and received an honorarium for an internal seminar at Biogen, Inc. He is PI of a collaborative study of the genetics of depression and bipolar disorder sponsored by 23andMe for which 23andMe provides analysis time as in-kind support but no payments. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
Selection of BioVU patients and datasets for different analyses presented in this manuscript. (a) BioVU patients were selected in parallel for clinical laboratory (lab) test cleaning and for genotyping. (b) Lab-specific quality control filters and subsetting were applied to the 939 lab tests in the 94,474 patients with clean lab data. Parallelograms denote input and output datasets. Options highlighted in green were selected for the proof-of-principle analyses of blood-based lipid lab values.
Fig. 2
Heritability and GWAS analyses of lipids. (a) Estimates of heritability computed by GCTA in BioVU patients were robust to excluding individuals with a diagnosis of CAD and to removing post-medication observations. (b) Estimates of heritability computed using GWAS summary statistics and LDSC were comparable across BioVU, the Global Lipids Genetic Consortium (GLGC), and the Million Veteran Program (MVP) samples. (c) Genetic correlations between lipid levels in BioVU and the GLGC or MVP, calculated using LDSC or high-definition likelihood (HDL). Stars denote statistically significant correlations.
Fig. 3
LabWAS of PGS_HDL in (a) individuals of European ancestry (EA) and (b) individuals of African ancestry (AA), LabWAS of PGS_LDL in (c) EA and (d) AA, and LabWAS of PGS_TG in (e) EA and (f) AA. The red line indicates the Bonferroni threshold for statistical significance and the blue line indicates a p value of 0.05. Upward triangles indicate that the PGS is associated with increased levels of the lab, while downward triangles indicate an association with reduced levels of the lab.
Fig. 4
LabWAS of PGS_CAD in (a) individuals of European ancestry and (b) individuals of African ancestry. LabWAS of PGS_CAD after controlling for CAD diagnosis in (c) individuals of European ancestry and (d) individuals of African ancestry. The red line indicates the Bonferroni threshold for statistical significance and the blue line indicates a p value of 0.05. Upward triangles indicate that the PGS_CAD is associated with increased levels of the lab, while downward triangles indicate an association with reduced levels of the lab.