Alzheimers Dement. 2025 May;21(5):e70237.
doi: 10.1002/alz.70237.

Alzheimer's Disease Sequencing Project release 4 whole genome sequencing dataset


Yuk Yee Leung et al. Alzheimers Dement. 2025 May.

Abstract

Introduction: The Alzheimer's Disease Sequencing Project (ADSP) is a national initiative to understand the genetic architecture of Alzheimer's disease and related dementias (ADRD) by integrating whole genome sequencing (WGS) with other genetic, phenotypic, and harmonized datasets from diverse populations.

Methods: In this fourth release (R4), the Genome Center for Alzheimer's Disease (GCAD) uniformly processed WGS data from 36,361 ADSP samples across 17 cohorts in 14 countries. The samples include 35,014 genetically unique participants, of whom 45% are of non-European ancestry.

Results: This sequencing effort identified 387 million bi-allelic variants, 42 million short insertions/deletions, and 6.8 million structural variants. Annotations and quality control data are available for all variants and samples. Detailed phenotypes from 15,927 participants across 10 domains are also provided. A linkage disequilibrium panel was created using unrelated AD cases and controls.

Discussion: Researchers can access and analyze the genetic data via the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) Data Sharing Service, VariXam, or NIAGADS GenomicsDB.

Highlights: We detailed the genetic architecture and quality of the Alzheimer's Disease Sequencing Project release 4 whole genome sequences. We identified 435 million single nucleotide polymorphisms, insertions and deletions, and structural variants from diverse genomes. We harmonized extensive phenotypes and created a linkage disequilibrium reference panel on a subset of samples. Data are publicly available at the NIAGADS Data Storage Site, and variants and annotations are browsable on two different websites.

Keywords: Alzheimer's disease; diversity; genetic architecture; genetics data sharing; genetics knowledgebase; linkage disequilibrium reference panel; whole genome sequencing.
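
The 435 million figure in the Highlights is consistent with the three counts given in the Results. The minimal Python sketch below sums those counts as a cross-check; the dictionary and variable names are illustrative only and are not part of any ADSP or NIAGADS tooling.

```python
# Variant counts in millions, copied from the Results paragraph of the abstract.
# The dictionary keys and variable names are illustrative, not from the paper.
variant_counts_millions = {
    "bi-allelic variants": 387.0,
    "short insertions/deletions": 42.0,
    "structural variants": 6.8,
}

# Sum the three categories to compare against the total quoted in the Highlights.
total_millions = sum(variant_counts_millions.values())

for category, count in variant_counts_millions.items():
    share = 100 * count / total_millions
    print(f"{category}: {count} million ({share:.1f}% of total)")

print(f"Total: {total_millions:.1f} million")
```

The abstract's counts are themselves rounded, so the sum only needs to agree with the Highlights total to within about a million.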

Conflict of interest statement

G.S. receives payment or honoraria for lectures, presentations, speakers’ bureaus, manuscript writing, or educational events from BrightFocus and USC. T.J.H. serves on the scientific advisory board for Vivid Genomics, and as deputy editor for Alzheimer's & Dementia: TRCI. Other authors have nothing to disclose. Author disclosures are available in the supporting information.

Figures

FIGURE 1
Participants in ADSP R4 dataset. A, Worldwide cohorts assembled for this ADSP R4 dataset. Non‐Hispanic Black with African Ancestry (NHB‐AA) samples are from Africa and North America (Canada, United States); Asian and Asian American samples are from Asia and North America (Canada, United States); Hispanic/Latino (HL) samples are from North America (Mexico/Caribbean, Canada, United States) and South America; non‐Hispanic White (NHW) samples are from Europe and North America (Canada, United States). Last, some samples are categorized as other or unknown; these are from Australia. B, Comparison of reported ethnicity against those inferred by GRAF‐pop– and SCOPE–based methods. C, Estimated GRAF‐pop ancestral components Pe, Pf, and Pa for all participants. D, PCA plot on R4 participants colored by reported ethnicity (top) or SCOPE. ADSP, Alzheimer's Disease Sequencing Project; PCA, principal component analysis.
FIGURE 2
WGS sample quality. A, Coverage (30x) for the ADSP R4 data. Red dotted line indicates a coverage value of 30; 99% of samples pass this threshold. B, Number of SNVs called per sample in each reported ethnic group. Line in each displayed boxplot denotes the mean value, and each dot is a sample. ADSP, Alzheimer's Disease Sequencing Project; SNV, single nucleotide variant; WGS, whole genome sequencing.
FIGURE 3
The distribution of variant types across the genome, with a specific focus on high‐risk loss‐of‐function variants. A, Bar chart depicting the breakdown of the total number of variants across the genome, categorized by genomic annotation as follows: insertions and deletions, loss‐of‐function variants, upstream gene variants, synonymous variants, non‐coding transcript exon variants, missense variants, intron variants, intragenic variants, intergenic variants, downstream variants, 5 prime UTR variants, and 3 prime UTR variants. B, The distribution of 224,594 loss‐of‐function variants is further broken down into the following categories: frameshift (39%), stop gained (27%), splice donor (16%), splice acceptor (12%), start lost (4%), and stop lost (2%). UTR, untranslated region.
FIGURE 4
Comparison of the number of SVs called in the ADSP R4 dataset across different reported ethnicities. SVs are categorized into four types: deletion (DEL), duplication (DUP), inversion (INV), and insertion (INS). ADSP, Alzheimer's Disease Sequencing Project; SV, structural variant.
FIGURE 5
ADSP‐PHC Release (ng00067.v11). Sample sizes (“N” on the right) reflect individuals with ADSP sequencing data in R4. ADSP, Alzheimer's Disease Sequencing Project; PHC, Phenotype Harmonization Consortium.
FIGURE 6
Browser of variants and annotations of diversified samples. A, VariXam interface. A variant browser displaying all genomic variants identified in the ADSP whole genome and exome data across releases. The figure below shows the search results for APOE. Accessible at: https://varixam.niagads.org/. B, The R4 variants can be visually inspected as a track on the NIAGADS Genome Browser. The track displays annotated short INDELs and SNVs that passed the biallelic QC criteria. Track annotations include the most severe variant consequences and consequence impacts predicted by the ADSP annotation pipeline and mappings to dbSNP refSNP identifiers. The track settings menu can be used to recolor the variants based on various annotations; the legend (made available by clicking on the track name) will update accordingly. Users can zoom into regions of interest (here, the green rectangle highlights the region displayed in the close‐up inset) to view sequence information and click on individual variants for a brief summary of the annotations. Full annotation results can be browsed by following the link to the GenomicsDB record for the variant. ADSP, Alzheimer's Disease Sequencing Project; APOE, apolipoprotein E; INDEL, insertion and deletion; NIAGADS, National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site; QC, quality control; SNV, single nucleotide variant.
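
The Figure 3B caption gives the loss‐of‐function breakdown only as percentages of 224,594 variants. As a rough illustration, the Python sketch below converts those percentages into approximate counts. Because the published percentages are rounded, the counts are estimates rather than values reported by the authors, and all names in the code are illustrative.

```python
# Total loss-of-function variants and category shares, from the Figure 3B caption.
TOTAL_LOF_VARIANTS = 224_594

lof_shares_percent = {
    "frameshift": 39,
    "stop gained": 27,
    "splice donor": 16,
    "splice acceptor": 12,
    "start lost": 4,
    "stop lost": 2,
}

# The published shares should account for all loss-of-function variants.
assert sum(lof_shares_percent.values()) == 100

# Approximate counts only: the caption's percentages are rounded to whole numbers.
approx_counts = {
    category: round(TOTAL_LOF_VARIANTS * percent / 100)
    for category, percent in lof_shares_percent.items()
}

for category, count in approx_counts.items():
    print(f"{category}: ~{count:,} variants")
```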

Update of

  • Alzheimer's Disease Sequencing Project Release 4 Whole Genome Sequencing Dataset.
    Leung YY, Lee WP, Kuzma AB, Nicaretta H, Valladares O, Gangadharan P, Qu L, Zhao Y, Ren Y, Cheng PL, Kuksa PP, Wang H, White H, Katanic Z, Bass L, Saravanan N, Greenfest-Allen E, Kirsch M, Cantwell L, Iqbal T, Wheeler NR, Farrell JJ, Zhu C, Turner SL, Gunasekaran TI, Mena PR, Jin J, Carter L; Alzheimer’s Disease Sequencing Project; Zhang X, Vardarajan BN, Toga A, Cuccaro M, Hohman TJ, Bush WS, Naj AC, Martin E, Dalgard C, Kunkle BW, Farrer LA, Mayeux RP, Haines JL, Pericak-Vance MA, Schellenberg GD, Wang LS. medRxiv [Preprint]. 2024 Dec 6:2024.12.03.24317000. doi: 10.1101/2024.12.03.24317000. Update in: Alzheimers Dement. 2025 May;21(5):e70237. doi: 10.1002/alz.70237. PMID: 39677464.
