Nat Genet. 2024 Dec;56(12):2672-2684.
doi: 10.1038/s41588-024-01972-8. Epub 2024 Nov 11.

Proteogenomic analysis of human cerebrospinal fluid identifies neurologically relevant regulation and implicates causal proteins for Alzheimer's disease

Daniel Western et al. Nat Genet. 2024 Dec.

Abstract

The integration of quantitative trait loci (QTLs) with disease genome-wide association studies (GWASs) has proven successful in prioritizing candidate genes at disease-associated loci. QTL mapping has been focused on multi-tissue expression QTLs or plasma protein QTLs (pQTLs). We generated a cerebrospinal fluid (CSF) pQTL atlas by measuring 6,361 proteins in 3,506 samples. We identified 3,885 associations for 1,883 proteins, including 2,885 new pQTLs, demonstrating unique genetic regulation in CSF. We identified CSF-enriched pleiotropic regions on chromosome (chr)3q28 near OSTN and chr19q13.32 near APOE that were enriched for neuron specificity and neurological development. We integrated our associations with Alzheimer's disease (AD) through proteome-wide association study (PWAS), colocalization and Mendelian randomization and identified 38 putative causal proteins, 15 of which have drugs available. Finally, we developed a proteomics-based AD prediction model that outperforms genetics-based models. These findings will be instrumental to further understand the biology and identify causal and druggable proteins for brain and neurological traits.

Conflict of interest statement

Competing interests: C.C. has received research support from GSK and Eisai. The funders of the study had no role in the collection, analysis or interpretation of data; in the writing of the report; or in the decision to submit the paper for publication. C.C. is a member of the advisory board of Circular Genomics and owns stocks in this company. D.J.P. is an employee of GSK and holds stock in GSK. M.d.C.M. has been an invited speaker at Eisai. M.d.C.M. is an associate editor at Alzheimer’s Research and Therapy. B.T. and P.J.V. are inventors on a patent (WO2020197399A1, owned by Stichting VUmc). C.E.T. has a collaboration contract with ADx Neurosciences, Quanterix and Eli Lilly and performed contract research or received grants from AC Immune, Axon Neuroscience, BioConnect, Bioorchestra, Brainstorm Therapeutics, Celgene, EIP Pharma, Eisai, Grifols, Novo Nordisk, PeopleBio, Roche, Toyama and Vivoryon. She serves on editorial boards of Medidact Neurologie–Springer, Alzheimer’s Research and Therapy and Neurology: Neuroimmunology and Neuroinflammation and is an editor of the Neuromethods book (Springer). She had speaker contracts for Roche, Grifols and Novo Nordisk. The rest of the authors declare no competing interest.

Figures

Fig. 1: Study design.
PPMI: Parkinson’s Progression Markers Initiative; ADNI: Alzheimer’s Disease Neuroimaging Initiative; DIAN: Dominantly Inherited Alzheimer’s Network; FACE: Fundació Ace; Knight-ADRC: Knight-ADRC Memory and Aging Project; MARS: Washington University Movement Disorder clinic; VEP: Variant Effect Predictor
Fig. 2: Cerebrospinal fluid pQTLs are consistent across disease and largely tissue and molecule-specific.
a. Combined Manhattan plot of the pQTL associations identified from linear regression analysis of 3,506 CSF samples and 7,008 aptamers. X-axis: genome position of an associated variant; y-axis: −log10(P) for association of each SNP with an aptamer. b. 2D Manhattan plot of 2,477 index pQTLs for 2,042 aptamers that were significant in the joint analysis. X-axis: genome position of the pQTL signal; y-axis: location of the protein-coding gene corresponding to the aptamer with a pQTL. Color represents cis or trans status of the index variant (cis, blue; trans, green). The top panel maps the pleiotropic regions of the genome (limited to 100 proteins with an association in the same region). c. Scatter plot of index pQTL variant absolute effect size (y-axis) vs effect allele frequency (x-axis). Color represents cis or trans status of the index variant (cis, blue; trans, green). Correlation was calculated using the Pearson method with two-sided P-values (cis P=1.36×10^−73, trans P=2.97×10^−115). d. Scatter plot of cis index pQTL −log10(P) (y-axis) vs distance from the transcription start site of the gene encoding the associated protein (x-axis). Color represents the minor allele frequency (darker=less common). e. Scatter plot of effect size of index pQTL variants in dichotomized amyloid/tau positive samples (x-axis) vs dichotomized amyloid/tau negative samples (y-axis). Color represents minor allele frequency (darker=less common). Correlation was calculated using the Pearson method with two-sided P estimated to be below the underflow value. f. Colocalization of CSF pQTL associations with plasma pQTL associations from Ferkingstad et al. Plasma + CSF: pQTL associations that colocalized across tissues. CSF-specific: pQTL associations that did not colocalize with plasma pQTLs for the same protein. New Proteins: pQTL associations for proteins measured in CSF but not plasma. Color represents cis or trans status of the index CSF pQTL (cis, blue; trans, green). g. Colocalization of cis CSF pQTL associations with various QTL types. Green bars represent brain-relevant tissues, while blue bars represent other tissues. Bar labels represent the number of colocalizing QTLs.
Fig. 3: Three pleiotropic regions make up hotspots of protein regulation involved in neurological processes.
a. Circos plot showing the genomic locations of all protein-coding genes whose proteins are regulated by the chr3q28 pleiotropic region. Colors represent the predominant brain-relevant cell type corresponding to that protein, as detailed in b. b. Enrichment of proteins associated with the chr3 region in brain-relevant cell types (based on classification shown in a). Fold change was calculated based on the number of cell type-specific proteins in the region compared to the number in the entire SOMAscan7k panel. Colors match those shown in a. Enrichment p-value was calculated using a one-sided hypergeometric test. c. Selected pathways enriched for proteins associated with the chr3 region. Gene Ratio represents the proportion of all proteins associated with the region that are part of each pathway. d. PheWAS of regression p-values of index pQTL SNPs located in the chr3 region, as found in the GWAS catalog. P-values were directly obtained from the GWAS catalog. e. Circos plot showing the genomic locations of all protein-coding genes whose proteins are associated with the chr6p22.2-21.32 pleiotropic region. f. Enrichment of proteins associated with the chr6 region in brain-relevant cell types. Enrichment p-value was calculated using a one-sided hypergeometric test. g. Selected pathways enriched for proteins associated with the chr6 region. h. Selected regression p-values for traits and diseases associated with index pQTL SNPs located in the chr6 region, as determined by the GWAS catalog. P-values were directly obtained from the GWAS catalog. i. Circos plot showing the genomic locations of all protein-coding genes whose proteins are regulated by the chr19q13.32 pleiotropic region. j. Enrichment of proteins associated with the chr19 pleiotropic region in brain-relevant cell types. Enrichment p-value was calculated using a one-sided hypergeometric test. k. Selected pathways enriched for proteins associated with the chr19 region. l. Selected regression p-values for traits and diseases associated with index pQTL SNPs located in the chr19 region, as determined by the GWAS catalog. P-values were directly obtained from the GWAS catalog.
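The cell-type enrichment in panels b, f and j (fold change against the full panel, with a one-sided hypergeometric p-value) can be illustrated with a minimal sketch. It is not the authors' code: the function names, argument names and the toy counts in the usage note are hypothetical, and the sketch assumes the test is the upper tail of the hypergeometric distribution, i.e. the probability of seeing at least the observed number of cell type-specific proteins among those associated with a region.

```python
from math import comb


def hypergeom_upper_tail(k, panel_size, panel_hits, region_size):
    """One-sided enrichment p-value, P(X >= k).

    X counts cell type-specific proteins among `region_size` proteins drawn
    without replacement from a panel of `panel_size` proteins, of which
    `panel_hits` are specific to the cell type (all names are illustrative).
    """
    upper = min(panel_hits, region_size)
    total = comb(panel_size, region_size)
    # math.comb returns 0 when the lower index exceeds the upper one, so
    # impossible counts contribute nothing to the sum.
    favourable = sum(
        comb(panel_hits, i) * comb(panel_size - panel_hits, region_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def fold_enrichment(k, panel_size, panel_hits, region_size):
    """Proportion of cell type-specific proteins in the region divided by
    the proportion in the whole panel."""
    return (k / region_size) / (panel_hits / panel_size)
```

As a toy usage example with made-up counts, `fold_enrichment(3, 10, 3, 4)` compares 3 of 4 region proteins against 3 of 10 panel proteins, and `hypergeom_upper_tail(3, 10, 3, 4)` gives the matching one-sided p-value.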
Fig. 4: AD-related proteins are enriched in microglia and immune-relevant pathways.
a. Upset plot of protein overlap between PWAS, colocalization (COLOC), and Mendelian randomization (MR) after removal of all associations in pleiotropic regions. b. Miami plot of proteins significant in at least two of PWAS, COLOC, and MR. X-axis: chromosome position of the transcription start site (TSS) of the protein-coding gene; y-axis: PWAS −log10(P) for association with AD calculated using FUSION. Red line: Benjamini-Hochberg (B&H) FDR-corrected p-value threshold. Top plot: positive association with AD; bottom plot: negative association with AD. Triangle-shaped points correspond to the 38 proteins prioritized in a. Color: predominant brain-relevant cell type for that protein. c. Circos plot showing trans associations linked to AD through two or more methods. Proteins and links labeled in red or blue are associated with AD through a trans association (red, positively associated; blue, negatively associated). Links start at the TSS of the associated protein-coding gene and end at the index pQTL variant. Associations are labeled in black by the proposed gene at the pQTL. d. Details of the 38 proteins that overlap between at least two of PWAS, colocalization, and MR. Major cell type: predominant brain-relevant cell type for proteins of interest. Cis/trans pQTL: type of pQTL driving the AD association. Drug Target: proteins targeted by a molecule described in the DrugBank database. e. Enrichment using a one-sided hypergeometric test of 38 proteins from d in brain-relevant cell types. ***: P=1.38×10^−4; *: P=0.013. f. Selected gene sets enriched for proteins from d. Gene Ratio: proportion of the 38 proteins/genes that are part of the pathway. g. Prediction of case/control status in testing dataset through protein-based (ProtRS), PRS-based (PRS + Age + Sex), and covariate-based (APOE + Age + Sex) models, and prediction of amyloid/tau positivity using the ProtRS model.
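The B&H FDR threshold shown as the red line in panel b refers to the Benjamini-Hochberg step-up procedure. The following is a minimal sketch of that procedure only, not the authors' pipeline: the function name is hypothetical, and the default FDR level of 0.05 is an assumption because the caption does not state the level used.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True if it is rejected at FDR `alpha`.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject every hypothesis ranked 1..k.
    `alpha=0.05` is an assumed default, not a value taken from the paper.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Largest rank whose p-value falls under its rank-specific threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Everything at or below the cutoff rank is rejected, even if its own
    # p-value missed its own threshold (this is the "step-up" property).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

The largest rejected p-value then plays the role of the significance line on a −log10(P) axis.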
Fig. 5: Cellular localization of immune and lysosomal proteins.
a. Involvement of 16 AD-associated immune-related proteins (IL34, PILRA, TREM2, LGALS3, APOE, CD33, SHARPIN, C1S, CR1, CR2, SIRPA, CD72, LILRB1, IL12A, FCGR3B, and SIGLEC9) in microglia signaling pathways. AD-prioritized proteins are highlighted in black text. b. Involvement of six lysosomal proteins (CLN5, EGFR, TMEM106B, GRN, CTSH, CST8) in various components of the lysosomal processing system. AD-prioritized proteins are highlighted in black text. Cell structure images in b were obtained from Servier via the open-source resource BioIcons.
