Metabolites. 2022 Mar 22;12(4):277.
doi: 10.3390/metabo12040277.

Predictive Modeling of Alzheimer's and Parkinson's Disease Using Metabolomic and Lipidomic Profiles from Cerebrospinal Fluid

Nathan Hwangbo et al. Metabolites.

Abstract

In recent years, metabolomics has been used as a powerful tool to better understand the physiology of neurodegenerative diseases and to identify potential biomarkers for progression. We used targeted and untargeted aqueous profiles, together with lipidomic profiles, of the metabolome from human cerebrospinal fluid to build multivariate predictive models distinguishing patients with Alzheimer's disease (AD), patients with Parkinson's disease (PD), and healthy age-matched controls. We emphasize several statistical challenges associated with metabolomic studies in which the number of measured metabolites far exceeds the sample size. We found strong separation in the metabolome between PD and controls, as well as between PD and AD, with weaker separation between AD and controls. Consistent with existing literature, we found alanine, kynurenine, tryptophan, and serine to be associated with PD classification against controls, while alanine, creatine, and long-chain ceramides were associated with AD classification against controls. We conducted a univariate pathway analysis of untargeted and targeted metabolite profiles and found that the vitamin E and urea cycle metabolism pathways are associated with PD, while the aspartate/asparagine and C21-steroid hormone biosynthesis pathways are associated with AD. We also found that the amount of metabolite missingness varied by phenotype, highlighting the importance of examining missing data in future metabolomic studies.

Keywords: biomarker; cerebrospinal fluid; cross-sectional study; neurodegenerative disease; predictive modeling.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure A1
(a) Distribution of subject age, split by phenotype; (b) a comparison of missingness between profiles, split by phenotype.
Figure A2
Flowchart outlining the analysis performed on each of the three profiles.
Figure A3
Distribution of sample collection date, split by phenotype.
Figure A4
For metabolites whose abundance missingness varies widely by phenotype (AD, PD, age-matched controls), subjects with PD tend to have the least missing data. Plotted is the percent missingness of untargeted metabolites for which a univariate logistic regression classifying missingness across subjects by phenotype had an FDR < 0.05. The controls used for this analysis were found by matching each member of the AD/PD cohort to the control of closest age and removing duplicates. Metabolites are listed by their retention time, neutral mass (Da), and mode, with each value separated by an underscore.
Figure A5
Precision–Recall Curves for binomial elastic net regressions classifying (a) controls against subjects with AD; (b) controls against subjects with PD; and (c) subjects with AD against subjects with PD, using the same predictions as Figure 2. Each line represents a model formed using one of the five missing data imputed datasets. The average Area Under the Curve (AUC) is displayed in the bottom right.
Figure A6
ROC curves displaying leave-one-out predictive accuracy of models that include age and sex as covariates rather than detrending each metabolite. Age and sex are not penalized in the elastic net procedure, reflecting the known associations of age and sex with AD and PD. As expected, we generally find that the predictive performance of these models is slightly better than that of the models presented in the main analysis.
Figure A7
ROC curves displaying leave-one-out predictive accuracy of models fit on the targeted profile after the batch orthogonalization procedure. After this procedure, the models are largely unable to distinguish AD from controls (a) but are still able to distinguish PD from controls (b), with less predictive accuracy than in the main analysis.
Figure A8
Comparison of univariate p-values before and after orthogonalization, using univariate logistic regressions fit on the targeted profile to classify PD against controls. p-values are Benjamini–Hochberg corrected within each dataset, with a log10 transformation applied. The y=x line is also drawn, with values above the line representing metabolites which are more differentially abundant after orthogonalization. Points are colored according to false discovery rate cutoffs of 0.05, indicating whether a metabolite is considered differentially abundant in both analyses, only one analysis, or in neither analysis.
Figure 1
Untargeted data projected onto the first two Principal Components (PC). Each point represents a subject, colored by phenotype. Percentages in the axis titles give the percentage of variation in the data explained by the respective PC. The 95% confidence ellipses, which assume a t-distribution, are also plotted. The first two principal components do not clearly separate the disease phenotypes.
Figure 2
Receiver Operating Characteristic (ROC) Curves for binomial elastic net regressions classifying (a) controls against subjects with AD, (b) controls against subjects with PD, and (c) subjects with AD against subjects with PD. Solid lines represent models formed using each of the five missing data imputed datasets. The dotted y=x line represents the ROC curve under a model which makes predictions at random. The average Area Under the Curve (AUC) across the five ROC curves is displayed in the bottom right.
Figure 3
Mummichog pathway analysis of the positive-mode untargeted metabolites, based on univariate logistic models classifying PD and AD against controls. Pathways are sorted by log10 Benjamini–Hochberg-corrected p-values, with the vertical dashed line marking p=0.05.
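
Several of the captions above (Figures A4, A8, and 3) rely on Benjamini–Hochberg corrected p-values and a false discovery rate cutoff of 0.05. The following is a minimal sketch of that correction in plain Python, not the authors' code; the function names `benjamini_hochberg` and `differentially_abundant` and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone and <= 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_abundant(p_values, fdr=0.05):
    """Flag each test whose adjusted p-value falls below the FDR cutoff."""
    return [q < fdr for q in benjamini_hochberg(p_values)]


# Hypothetical raw p-values, one per metabolite (illustration only).
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
example_flags = differentially_abundant(example_p)
```

In a comparison such as the one in Figure A8, the correction would be applied separately within each dataset (before and after orthogonalization), and the resulting flags compared metabolite by metabolite.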
