Meta-Analysis
Genome Biol. 2025 Sep 9;26(1):273.
doi: 10.1186/s13059-025-03698-0.

Whole genome sequence analysis of low-density lipoprotein cholesterol across 246 K individuals

Margaret Sunitha Selvaraj, Xihao Li, Zilin Li, Eric Van Buren, Sara Haidermota, Darina Postupaka, Whitney Hornsby, Joshua C Bis, Jennifer A Brody, Brian E Cade, Ren-Hua Chung, Joanne E Curran, Scott M Damrauer, Lisa de Las Fuentes, Paul S de Vries, Ravindranath Duggirala, Barry I Freedman, MariaElisa Graff, Xiuqing Guo, Bertha A Hidalgo, Lifang Hou, Ryan Irvin, Renae Judy, Rita R Kalyani, Tanika N Kelly, Iain R Konigsberg, Brian G Kral, Lydia Coulter Kwee, Daniel Levy, Changwei Li, Ani W Manichaikul, Lisa Warsinger Martin, May E Montasser, Alanna C Morrison, Take Naseri, Kari E North, Jeffrey R O'Connell, Nicholette D Palmer, Patricia A Peyser, Alex P Reiner, Svati H Shah, Roelof A J Smit, Jennifer A Smith, Kent D Taylor, Hemant Tiwari, Michael Y Tsai, Satupa'itea Viali, Zhe Wang, Yuxuan Wang, Wei Zhao, Donna K Arnett, John Blangero, Eric Boerwinkle, Donald W Bowden, Jenna C Carlson, Yii-Der Ida Chen, Patrick T Ellinor, Myriam Fornage, Jiang He, Nancy Heard-Costa, Robert C Kaplan, Sharon L R Kardia, Charles Kooperberg, William E Kraus, Leslie A Lange, Ruth J F Loos, Braxton D Mitchell, Bruce M Psaty, Daniel J Rader, Susan Redline, Stephen S Rich, Lisa R Yanek, Richard Gibbs, Stacey Gabriel, Karine A Viaud-Martinez, Susan K Dutcher, Soren Germer, Ryan Kim, Jerome I Rotter, Xihong Lin, Gina M Peloso #, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Pradeep Natarajan #
Margaret Sunitha Selvaraj et al. Genome Biol.

Abstract

Background: Rare genetic variation captured by whole genome sequencing datasets remains relatively underexplored for its contribution to human traits. Meta-analysis of sequencing data integrates larger sample sizes from diverse cohorts, thereby increasing the likelihood of discovering novel insights into complex traits. In addition, emerging methods for genome-wide rare variant association testing improve power and interpretability.

Results: Here, we conduct the largest meta-analysis of whole genome sequencing for low-density lipoprotein cholesterol (LDL-C), a therapeutic target for coronary artery disease, analyzing data from 246 K participants and integrating 1.23 billion variants from the UK Biobank and the Trans-Omics for Precision Medicine (TOPMed) program. We identify numerous gene-level associations of rare coding and non-coding variants with LDL-C, with replication across 86 K participants in All of Us. Our findings are based on single-variant analyses, rare coding and non-coding variant aggregation tests, and sliding window approaches. Through this comprehensive analysis, we identify 704 novel single-variant associations, 25 novel rare coding variant aggregates, 28 novel rare non-coding variant aggregates, and one novel sliding window aggregate.

Conclusions: This study provides a meta-analysis framework for large-scale whole genome sequence association analyses from diverse population groups, yielding novel rare non-coding variant associations.

Conflict of interest statement

Declarations.

Ethics approval and consent to participate: For the TOPMed cohort, study participants provided consent per each study’s Institutional Review Board (IRB)-approved protocol. The TOPMed data analysis is associated with paper proposal ID 15536. For UKB participants, written informed consent was given per the UKB primary protocol. UK Biobank data analysis was facilitated through UKB application 7089. For the AOU cohort, written informed consent was provided in accordance with the primary Institutional Review Board for AOU. AOU data analysis was facilitated through the AOU Researcher Workbench. Secondary use of UK Biobank, TOPMed, and AOU data was approved by the Massachusetts Hospital Institutional Review Board.

Consent for publication: All participants provided informed consent for publication.

Competing interests: P.N. reports research grants from Allelica, Amgen, Apple, Boston Scientific, Cleerly, Genentech / Roche, Ionis, Novartis, and Silence Therapeutics, personal fees from AIRNA, Allelica, Apple, AstraZeneca, Bain Capital, Blackstone Life Sciences, Bristol Myers Squibb, Creative Education Concepts, CRISPR Therapeutics, Eli Lilly & Co, Esperion Therapeutics, Foresite Capital, Foresite Labs, Genentech / Roche, GV, HeartFlow, Magnet Biomedicine, Merck, Novartis, Novo Nordisk, TenSixteen Bio, and Tourmaline Bio, equity in Bolt, Candela, Mercury, MyOme, Parameter Health, Preciseli, and TenSixteen Bio, royalties from Recora for intensive cardiac rehabilitation, and spousal employment at Vertex Pharmaceuticals, all unrelated to the present work. The remaining authors declare no competing interests.

Figures

Fig. 1
Overall results from TOPMed and UKB WGS meta-analysis. A Procedure used to identify novel variants from individual variant GWAS: A total of 21,657 variants were genome-wide significant in the individual variant meta-analysis, with a P-value ≤ 5 × 10⁻⁹. These variants were used to define genomic risk loci using FUMA with 1000 Genomes phase 3 as the reference panel: independent SNPs (r² ≥ 0.6) were identified, and LD blocks of independent significant SNPs located close to each other (< 250 kb, based on the rightmost and leftmost SNPs of each LD block) were merged into one genomic locus. Finally, we identified 128 genomic loci from our individual variant GWAS. Comparison of these genomic risk loci with the GLGC summary statistics yielded one locus with a genome-wide significant variant associated with LDL-C. Of all the genome-wide significant variants identified in the present study, 704 were unique to WGS data and not found in the GLGC summary statistics. B Procedure used to identify novel aggregates from rare variant tests: Across a total of 20 K genes, rare variant aggregates were assessed for coding (5 masks) and non-coding (7 masks) variants, along with 2.6 M regions based on the sliding-window approach. Bonferroni-corrected p-values were used to identify genome-wide significant rare variant aggregates before and after conditional analysis. The set of aggregates that passed the conditional analysis was tested for replication in an independent cohort (i.e., AoU). The number of rare variant aggregates passing each step is shown. WGS: Whole Genome Sequencing; GLGC: Global Lipids Genetics Consortium; SNP: Single-Nucleotide Polymorphism; GWAS: Genome-Wide Association Studies; LD: Linkage Disequilibrium; LDL-C: Low-Density Lipoprotein Cholesterol
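The distance-based merging step in the Fig. 1 caption (LD blocks closer than 250 kb are combined into one genomic locus) can be illustrated with a minimal sketch. This is not FUMA's implementation: the function name `merge_ld_blocks`, the tuple layout, and the example coordinates are hypothetical, and the sketch assumes each LD block has already been reduced to a chromosome plus the positions of its leftmost and rightmost SNPs.

```python
from typing import List, Tuple

# Hypothetical representation of an LD block: (chromosome, leftmost SNP bp, rightmost SNP bp)
Block = Tuple[str, int, int]


def merge_ld_blocks(blocks: List[Block], max_gap_bp: int = 250_000) -> List[Block]:
    """Merge LD blocks on the same chromosome whose gap is below max_gap_bp."""
    merged: List[Block] = []
    # Sorting by (chromosome, start) puts neighbouring blocks next to each other.
    for chrom, start, end in sorted(blocks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < max_gap_bp:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the current locus to cover the new block.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # Different chromosome or gap too large: start a new locus.
            merged.append((chrom, start, end))
    return merged


# Made-up coordinates for illustration only (not data from the study).
example_blocks = [
    ("1", 100_000, 200_000),
    ("1", 400_000, 500_000),   # 200 kb from the previous block: merged
    ("1", 900_000, 950_000),   # 400 kb from the previous block: separate locus
    ("2", 100_000, 150_000),   # different chromosome: separate locus
]
loci = merge_ld_blocks(example_blocks)
print(loci)
```

Under these assumptions the four example blocks collapse into three loci, because only the first two blocks on chromosome 1 lie within 250 kb of each other.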
Fig. 2
MetaSTAAR-O p-value comparison. A MetaSTAAR-O p-values before and after conditional analysis for gene-centric coding aggregates. B MetaSTAAR-O p-values before and after conditional analysis for gene-centric non-coding aggregates. In both plots, the aggregates are ordered by conditional MetaSTAAR p-value. C Comparison of MetaSTAAR-O p-values for genes that have at least one coding and one non-coding signal before conditional analysis. Each dot represents a gene, and the color of each dot indicates whether the gene is genome-wide significant in the coding aggregates, the non-coding aggregates, or both. The most significant genes in each of those categories are labeled
Fig. 3
ABCA6 protein structure with variants identified using the aPC scores. A Protein structure of ABCA6, which consists of 2 ABC transporter and transmembrane domains. B Structure of ABCA6 with the 39 SNVs identified from the ABCA6-pLoF-ds aggregate set (red), the 30 SNVs identified from the ABCA6-missense aggregate set (orange), and the common SNVs mapping to both aggregate sets (pink). C Structure of ABCA6 with the high-scoring variants (red) and the structural proximity of the stretch of amino acids in the ABC transporter 1 domain identified through this analysis. ABC: ATP-binding cassette; SNV: Single-Nucleotide Variations; pLoF-ds: putative loss-of-function and disruptive missense variants

