Am J Hum Genet. 2022 Mar 3;109(3):433-445.
doi: 10.1016/j.ajhg.2022.01.018. Epub 2022 Feb 22.

GWAS of longitudinal trajectories at biobank scale

Seyoon Ko et al. Am J Hum Genet.

Abstract

Biobanks linked to massive, longitudinal electronic health record (EHR) data make numerous new genetic research questions feasible. One of these is the study of biomarker trajectories. For example, high blood pressure measurements over visits strongly predict stroke onset, and consistently high fasting glucose and HbA1c levels define diabetes. Recent research reveals that not only the mean level of biomarker trajectories but also their fluctuations, or within-subject (WS) variability, are risk factors for many diseases. Glycemic variation, for instance, has recently come to be regarded as an important clinical metric in diabetes management. It is crucial to identify the genetic factors that shift the mean or alter the WS variability of a biomarker trajectory. Compared to traditional cross-sectional studies, trajectory analysis utilizes more data points and captures a more complete picture of the impact of time-varying factors, including medication history and lifestyle. Currently, there are no efficient tools for genome-wide association studies (GWASs) of biomarker trajectories at the biobank scale, even for just mean effects. We propose TrajGWAS, a linear mixed effect model-based method for testing genetic effects that shift the mean or alter the WS variability of a biomarker trajectory. It is scalable to biobank data with 100,000 to 1,000,000 individuals and many longitudinal measurements, and it is robust to distributional assumptions. Simulation studies corroborate that TrajGWAS controls the type I error rate and is powerful. Analysis of eleven biomarkers measured longitudinally and extracted from UK Biobank primary care data for more than 150,000 participants with 1,800,000 observations reveals loci that significantly alter the mean or WS variability.

Keywords: biobank; biomarkers; disease progression; electronic medical records; longitudinal; trend; variations; within-subject variability.
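
The difference between a genetic effect on the mean and one on the WS variability can be illustrated with a small simulation. The sketch below is not TrajGWAS and does not use its score tests. It assumes a simplified model in which each subject has a random intercept and the WS variance is log-linear in genotype, and it recovers the two effects with a naive two-stage estimator (per-subject mean and per-subject log sample variance, each regressed on genotype). The function names and the parameters `beta_g` and `tau_g` are hypothetical labels chosen for illustration.

```python
import numpy as np


def simulate_trajectories(m=2000, n_obs=8, maf=0.3, beta_g=0.2, tau_g=0.3, seed=0):
    """Simulate m subjects with n_obs repeated biomarker measurements each.

    Assumed toy model (not taken from the paper):
      y_ij = beta_g * g_i + b_i + e_ij,
      b_i  ~ N(0, 1)                      (subject-level random intercept)
      e_ij ~ N(0, exp(tau_g * g_i))       (WS variance depends on genotype)
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, maf, size=m).astype(float)  # additive coding 0/1/2
    b = rng.normal(0.0, 1.0, size=m)
    sd_ws = np.exp(0.5 * tau_g * g)                 # WS standard deviation
    noise = rng.normal(size=(m, n_obs)) * sd_ws[:, None]
    y = beta_g * g[:, None] + b[:, None] + noise
    return g, y


def slope(x, y):
    """Ordinary least-squares slope of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def two_stage_estimates(g, y):
    """Naive estimates of the mean effect and the WS variability effect."""
    subj_mean = y.mean(axis=1)                   # summarizes trajectory level
    subj_logvar = np.log(y.var(axis=1, ddof=1))  # summarizes WS fluctuation
    return slope(g, subj_mean), slope(g, subj_logvar)
```

Calling `two_stage_estimates(*simulate_trajectories(m=20000))` should return values close to the simulated `beta_g` and `tau_g`. The random intercept cancels out of the per-subject variance, which is why the second regression isolates the variability effect under this toy model. A two-stage approach like this ignores time-varying covariates and unequal numbers of visits, which a mixed effect model is designed to handle.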

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1
Illustration of TrajGWAS. TrajGWAS identifies the genetic factors that shift the mean or alter the within-subject (WS) variability of a biomarker trajectory (e.g., blood pressure measurements over visits), which changes with time-varying covariates (e.g., medication history) or time-invariant covariates (e.g., sex).
Figure 2
Quantile-quantile (QQ) plots of p values for simulation studies. (A and B) QQ plots of p values from the score test for testing β_g (A) and for testing τ_g (B), where m = 6,000, n_i = 6 to 10, and minor allele frequency (MAF) = 0.01, based on 10^9 replicates. (C and D) QQ plots of saddlepoint approximation (SPA)-corrected p values for testing β_g (C) and for testing τ_g (D); simulations are based on m = 6,000, n_i = 6 to 10, and MAF = 0.01. SPA corrects the deflation or inflation that occurs in the score test at low MAFs. QQ plots for all simulation scenarios are shown in Figures S2–S5.
Figure 3
Empirical powers of testing β_g with score test and SPA. Each row contains the same sample size m and number of observations per individual n_i. Power is evaluated at the significance level α = 5×10^-8. Each scenario is based on 1,000 replicates.
Figure 4
Manhattan plots for testing τ_g for longitudinal markers in the UK Biobank study. Manhattan plots for testing τ_g, the effects on the WS variability, for 11 longitudinal biomarkers in the UK Biobank study. The blue line represents the genome-wide significance level, 5×10^-8.
Figure 5
SNPs that significantly change the WS variability while not significantly shifting the mean. SNPs that significantly change the WS variability of longitudinal biomarkers (top) and total cholesterol (TC) trajectory (bottom) while not significantly shifting the mean of their trajectories. Each dot is a SNP that passes the genome-wide significance level (dashed line) for τ_g but not for β_g.
