[Preprint]. 2023 Jun 29:2023.06.28.23291975.
doi: 10.1101/2023.06.28.23291975.

Diversity and Scale: Genetic Architecture of 2,068 Traits in the VA Million Veteran Program

Anurag Verma, Jennifer E Huffman, Alex Rodriguez, Mitchell Conery, Molei Liu, Yuk-Lam Ho, Youngdae Kim, David A Heise, Lindsay Guare, Vidul Ayakulangara Panickan, Helene Garcon, Franciel Linares, Lauren Costa, Ian Goethert, Ryan Tipton, Jacqueline Honerlaw, Laura Davies, Stacey Whitbourne, Jeremy Cohen, Daniel C Posner, Rahul Sangar, Michael Murray, Xuan Wang, Daniel R Dochtermann, Poornima Devineni, Yunling Shi, Tarak Nath Nandi, Themistocles L Assimes, Charles A Brunette, Robert J Carroll, Royce Clifford, Scott Duvall, Joel Gelernter, Adriana Hung, Sudha K Iyengar, Jacob Joseph, Rachel Kember, Henry Kranzler, Daniel Levey, Shiuh-Wen Luoh, Victoria C Merritt, Cassie Overstreet, Joseph D Deak, Struan F A Grant, Renato Polimanti, Panos Roussos, Yan V Sun, Sanan Venkatesh, Georgios Voloudakis, Amy Justice, Edmon Begoli, Rachel Ramoni, Georgia Tourassi, Saiju Pyarajan, Philip S Tsao, Christopher J O'Donnell, Sumitra Muralidhar, Jennifer Moser, Juan P Casas, Alexander G Bick, Wei Zhou, Tianxi Cai, Benjamin F Voight, Kelly Cho, Michael J Gaziano, Ravi K Madduri, Scott M Damrauer, Katherine P Liao
Anurag Verma et al. medRxiv. 2023.

Update in

  • Diversity and scale: Genetic architecture of 2068 traits in the VA Million Veteran Program.
    Verma A, Huffman JE, Rodriguez A, Conery M, Liu M, Ho YL, Kim Y, Heise DA, Guare L, Panickan VA, Garcon H, Linares F, Costa L, Goethert I, Tipton R, Honerlaw J, Davies L, Whitbourne S, Cohen J, Posner DC, Sangar R, Murray M, Wang X, Dochtermann DR, Devineni P, Shi Y, Nandi TN, Assimes TL, Brunette CA, Carroll RJ, Clifford R, Duvall S, Gelernter J, Hung A, Iyengar SK, Joseph J, Kember R, Kranzler H, Kripke CM, Levey D, Luoh SW, Merritt VC, Overstreet C, Deak JD, Grant SFA, Polimanti R, Roussos P, Shakt G, Sun YV, Tsao N, Venkatesh S, Voloudakis G, Justice A, Begoli E, Ramoni R, Tourassi G, Pyarajan S, Tsao P, O'Donnell CJ, Muralidhar S, Moser J, Casas JP, Bick AG, Zhou W, Cai T, Voight BF, Cho K, Gaziano JM, Madduri RK, Damrauer S, Liao KP. Science. 2024 Jul 19;385(6706):eadj1182. doi: 10.1126/science.adj1182. PMID: 39024449

Abstract

Genome-wide association studies (GWAS) have underrepresented individuals from non-European populations, impeding progress in characterizing the genetic architecture and consequences of health and disease traits. To address this, we present a population-stratified phenome-wide GWAS followed by a multi-population meta-analysis for 2,068 traits derived from electronic health records of 635,969 participants in the Million Veteran Program (MVP), a longitudinal cohort study of diverse U.S. Veterans genetically similar to the respective African (121,177), Admixed American (59,048), East Asian (6,702), and European (449,042) superpopulations defined by the 1000 Genomes Project. We identified 38,270 independent variants associating with one or more traits at experiment-wide significance (P < 4.6 × 10⁻¹¹); fine-mapping resolved 6,318 signals from 613 traits to single-variant resolution. Among these, a third (2,069) of the associations were found only among participants genetically similar to non-European reference populations, demonstrating the importance of expanding diversity in genetic studies. Our work provides a comprehensive atlas of phenome-wide genetic associations for future studies dissecting the architecture of complex traits in diverse populations.
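
The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the study's analysis pipeline; the variable and function names are hypothetical, and the numbers are copied directly from the abstract.

```python
# Illustrative consistency check of figures quoted in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.

population_counts = {
    "African": 121_177,
    "Admixed American": 59_048,
    "East Asian": 6_702,
    "European": 449_042,
}
total_participants = 635_969      # participants reported in the abstract
fine_mapped_signals = 6_318       # signals fine-mapped to single-variant resolution
non_european_only = 2_069         # of those, found only in non-European groups


def population_shares(counts):
    """Return each group's fraction of the summed cohort."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


shares = population_shares(population_counts)
non_european_fraction = non_european_only / fine_mapped_signals

if __name__ == "__main__":
    print("Sum of groups:", sum(population_counts.values()))
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
    print(f"Non-European-only fraction: {non_european_fraction:.1%}")
```

The four group sizes add up to the reported 635,969 participants, and 2,069 of 6,318 is a little under one third, consistent with the abstract's wording.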

Conflict of interest statement

Competing interests: CJD and JPC are employed full-time by the Novartis Institutes for BioMedical Research (their major contributions to this project were made while employed at VA Boston Healthcare System). HK is a member of advisory boards for Dicerna Pharmaceuticals, Sophrosyne Pharmaceuticals, Enthion Pharmaceuticals, and Clearmind Medicine; a consultant to Sobrera Pharmaceuticals; the recipient of research funding and medication supplies for an investigator-initiated study from Alkermes; a member of the American Society of Clinical Psychopharmacology’s Alcohol Clinical Trials Initiative, which was supported in the last three years by Alkermes, Dicerna, Ethypharm, Lundbeck, Mitsubishi, Otsuka, and Pear Therapeutics; and holder of U.S. patent 10,900,082 titled “Genotype-guided dosing of opioid agonists,” issued 26 January 2021. JG and RP are paid for their editorial work in the journal Complex Psychiatry. RP reports a research grant from Alkermes. SD reports grants from Alnylam Pharmaceuticals, Inc.; Astellas Pharma, Inc.; AstraZeneca Pharmaceuticals LP; Biodesix; Celgene Corporation; Cerner Enviza; GlaxoSmithKline PLC; Janssen Pharmaceuticals, Inc.; Kantar Health; Myriad Genetic Laboratories, Inc.; Novartis International AG; and Parexel International Corporation through the University of Utah or Western Institute for Veteran Research, outside the submitted work. SMD receives research support from RenalytixAI and Novo Nordisk, outside the scope of the current research, and is named as a co-inventor on a Government-owned US Patent application related to the use of genetic risk prediction for venous thromboembolic disease and for the use of PDE3B inhibition for preventing cardiovascular disease, both filed by the US Department of Veterans Affairs in accordance with Federal regulatory requirements.

Figures

Fig. 1. Overview of the study population, genetic association results, and post-GWAS findings.
The top panel describes the demographic characteristics of the study population. The middle panel features two scatter plots: the left plot illustrates the number of participants (y-axis) with each binary trait grouped by category (x-axis), and the right plot presents the sample size (y-axis) across quantitative traits (x-axis). Distinct colors and shapes represent population groups. The bottom panel is organized into three sections: the left section summarizes the study data, the middle section provides key metrics of GWAS results, such as the count of independent loci and lead SNPs, and the right section briefly outlines the post-GWAS findings.
Fig. 2. Multi-population genetic associations with traits.
Combined multi-trait Manhattan plots and bar plots summarizing 98,485 associations with lead SNPs for binary and quantitative traits (P < 4.6 × 10⁻¹¹). Manhattan plots for (a) binary traits and (b) quantitative traits display associations across chromosomes (x-axis) and −log10 P values (y-axis). Circles represent previously reported associations, while triangles indicate novel trait associations. The size of the triangles corresponds to the effect size, with upward triangles denoting risk associations and downward triangles signifying protective associations. On the top, gene names indicate previously reported variant-trait associations (in black) and new trait associations (in pink). Stacked bar plots for (c) binary traits and (d) quantitative traits enumerate the number of associations with lead SNPs across different trait categories. The left panel presents the count of known associations (green), novel variant-trait associations (blue), and novel variants (light blue). Trait categories are ordered by the number of lead SNPs in descending order. The right panel is a dodged bar plot highlighting associations with lead SNPs based on their MAF categories: common variants (lower opacity) and low-frequency variants (higher opacity). The distribution of known associations (green) and novel SNPs (light blue) is shown for each trait category.
Fig. 3. Multi-Population Fine-Mapped Signals.
(a) Upset plot of cross-population signal sharing for the 57,601 fine-mapped signals. Red-colored portions of bars represent signals where one or more variants show a suggestive association (P < 1 × 10⁻³) in an unmapped population, and blue-colored portions represent signals where the unmapped populations were underpowered to detect suggestive associations for any of the variants in the merged approximate credible set. Signal counts are displayed above the bars for intersections in which fewer than 1,000 signals were identified. (b) Scatter plot of the number of signals detected per phenotype vs. the meta-analyzed sample size for the phenotype. Effective sample sizes were used for binary phenotypes, and points are colored by the phenotype category. (c) The distribution of merged approximate credible set sizes for the fine-mapped signals. (d) Coding enrichment across increasingly precise fine-mapped signals. Bars are colored by the proportion of each bar represented by each grouped Variant Effect Predictor (VEP) annotation, and the black boxes illustrate the proportion of each bar attributable to coding variation. P-values reflect the results of pairwise Wilcoxon tests for coding annotation enrichment. (e) Distribution of effect sizes vs. minor allele frequencies for high-confidence (PIP > 0.95) associations fine-mapped in quantitative (top) and binary (bottom) phenotypes. Each point represents a unique high-confidence variant-phenotype-population mapping. Point colors reflect the population in which they are mapped, and their shapes reflect whether they are a phenotype association previously reported in the GWAS catalog (square), a novel phenotype association for a variant previously reported in the catalog (triangle), or a novel SNP and association not previously reported (circle). Inset bar plots reflect the proportions of high-confidence associations in these three categories across the four tested populations.
Fig. 4. Fine-mapped cross-trait associations.
Chromosome ideogram illustrating high-confidence cross-trait associations (PIP > 0.95) between genetic variants and independent traits. Independence was determined using a phenotypic correlation coefficient threshold of 0.2 and a stepwise selection process accounting for the largest PIP value, meta-analysis p-value, population-specific p-values, effect size, and trait specificity (e.g., gender-specific traits). The ideogram highlights genomic regions associated with three or more traits, including both previously reported associations and novel trait associations discovered in the present study. Colors represent distinct trait categories, while shapes distinguish between previously reported (circle) and novel trait associations (triangle). Each data point on the ideogram indicates that at least one trait within the respective category is associated with the genomic region.
