[Preprint]. 2025 Dec 8:2025.12.06.25341403.
doi: 10.64898/2025.12.06.25341403.

Predicting Phenoconversion to Clinically Manifest ALS: Results of a Large-Scale Proteomic Study


Ximing Ran et al. medRxiv.

Abstract

The study of pre-symptomatic amyotrophic lateral sclerosis (ALS) and the design of disease prevention trials are greatly hampered by our inability to predict which unaffected carriers of ALS-associated pathogenic variants will phenoconvert to clinically manifest disease, and when. In this longitudinal Olink Explore high-throughput proteomic study, 516 serially collected plasma samples from 33 phenoconverters, 35 patients with ALS, 10 pre-symptomatic pathogenic variant carriers and 59 controls were included. We identified 81 proteins whose concentrations changed prior to phenoconversion; characterized the longitudinal trajectory of these proteins; and identified a core panel of 19 proteins that, collectively, predicted phenoconversion over the 0.5- to 5-year time horizons (areas under curve 0.80-0.89) and yielded estimates of time-to-phenoconversion with a mean absolute error of 1.6 years. These findings were replicated in UK Biobank data, confirming pre-symptomatic increases in several proteins (e.g. NEFL, EDA2R, CA3) and that a multi-protein panel outperformed NEFL alone in estimating time-to-phenoconversion. This work sheds light on the biology of pre-symptomatic ALS. Moreover, our identification of a panel of novel susceptibility/risk biomarkers based on empirical longitudinal data furthers the ultimate goal of ALS prevention.

Conflict of interest statement

V.G. is an employee at Biohaven Pharmaceuticals and holds stocks and/or stock options in the company. A.M. reports contracts from My Name’5 Doddie Foundation, Target ALS, NIHR UCL Biomedical Research Centre, LifeArc, Medical Research Council, NIH, Motor Neurone Disease Association, Alan Davidson Foundation, Weston Family Foundation, and EU H2020 programme; consulting fees from Pfizer, Novartis, LifeArc, Accure, and Trace Neuroscience; licences to Biogen and ILTOO; and patent numbers WO2021176044 A1 and WO2024121173A1. M.B. reports consulting fees from Alaunos, Alector, Alexion, Amgen, Annexon, Arrowhead, Biogen, Bristol Myers Squibb, Canopy, Cartesian, CorEvitas, Denali, Eli Lilly, Immunovant, Janssen, Merck, Novartis, Prilenia, Roche, Sanofi, Takeda, UCB, uniQure, Voyager, and Woolsey; and is an unpaid member of the Board of Trustees of the ALS Association. The University of Miami has licensed intellectual property to Biogen to support design of the ATLAS study. X.R., J.W., Z.S.Q., J.C.-K., A.L.G., Y.L., E.L., M.F.C., D.C., N.C., C.M.L. and P.P. report no competing interests.

Figures

Figure 1. Plasma markers differentially regulated in ALS vs. healthy controls
(A) Volcano plot illustrating differentially regulated proteins between clinically manifest ALS and healthy controls. Significantly upregulated proteins are shown in red, and significantly downregulated proteins are shown in blue. The dashed black line indicates the false discovery rate (FDR) threshold of 0.05. (B) The top 20 gene ontology (GO) terms that are most enriched among upregulated proteins. Dark red shading and bold font highlight GO terms associated with muscle function. The dashed black line indicates the FDR threshold of 0.05. (C) The top 20 GO terms that are most enriched among downregulated proteins. The dashed black line indicates the FDR threshold of 0.05. (D) Protein-protein interaction network of differentially regulated proteins, with clusters related to skeletal muscle (red), the extracellular matrix (olive), regeneration and neurofilament (green), and positive regulation of tumor necrosis factor (TNF)-mediated signaling (orange) highlighted. (E) Heatmap of the subset of 49 proteins with distinct expression patterns corresponding to the four clusters highlighted in (D). Columns represent data from individual study participants; rows indicate specific proteins.
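As an illustration of the FDR threshold of 0.05 used in panels A to C, the sketch below adjusts a list of p-values and flags those that pass. The caption does not name the FDR procedure, so the Benjamini-Hochberg method is an assumption here, and the p-values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the caption only states an FDR threshold of 0.05; the
    BH procedure is used here purely for illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per protein (not data from the study).
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in q_values]
```

With these made-up inputs, only the first two proteins remain below the 0.05 threshold after adjustment, even though four of the raw p-values are below 0.05.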
Figure 2. Longitudinal trajectory of protein biomarkers before and after phenoconversion
(A) Heatmap depicting 81 proteins with significant changes in their relative abundance, as compared to controls, during the pre-symptomatic and clinically manifest stages of disease. The color scale reflects the direction and magnitude of change, with warmer colors (red) indicating increased concentrations and cooler colors (blue) indicating decreased concentration relative to controls. The number in each cell represents the average protein abundance, relative to controls, during the indicated time interval before or after phenoconversion. Empty cells denote time points where differences were not statistically significant. Proteins are grouped by functional modules (see Figure 1D), as indicated by the color bar in the first column. (B) Longitudinal trajectories of four proteins (NEFL, CALCA, EDA2R and CA3), one representing each functional module, to further illustrate relative protein abundance, in log2 fold change (log2FC), during the periods before and after phenoconversion. In addition to phenoconverters, participants with clinically manifest ALS were included; their estimated date of symptom onset was used as a proxy for date of phenoconversion. The dots and connecting lines show the longitudinal data from individual participants. The black curve depicts the estimated mean longitudinal trajectory of protein abundance, adjusting for age and sex; the shaded band denotes the 95% confidence interval around the fitted mean trend. Horizontal red lines indicate the time periods during which relative protein abundance is significantly elevated (i.e. log2FC>0).
Figure 3. Phenoconversion event prediction
(A) Receiver operating characteristic (ROC) curves comparing the performance of phenoconversion event prediction of 3 different models. Red lines correspond to prediction results obtained using NEFL alone (shading represents the 95% CI). Blue lines correspond to results obtained using the 19-protein panel. Green lines correspond to results obtained using the 15-protein panel (subset of 19 proteins that is also available in UK Biobank data). (B) Heatmap illustrating the area under the curve (AUC) of ROC obtained, via logistic regression, for each of the 77 proteins. Analyses are conducted across five different timeframes (0.5, 1, 2, 3 and 5 years). Each AUC value is obtained from 5-fold cross validation. Darker shades of blue indicate higher AUC. Proteins are grouped by functional modules (see Figure 1D), as indicated by the color bar in the first column. (C) UpSet plot of protein sets identified by logistic regression to predict phenoconversion over the five timeframes. Horizontal bar chart (lower left panel) summarizes the number of proteins included in the model for each timeframe, with NEFL and DUSP29 contributing to phenoconversion event prediction across all 5 timeframes.
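To make the per-timeframe AUC values in panel B concrete, the following sketch computes an AUC for several prediction horizons from a set of risk scores. It is not the authors' pipeline: it omits the logistic regression fit and the 5-fold cross validation, the rule used to label a sample as an event within a horizon is an assumption, and all values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def labels_for_horizon(years_to_event, horizon):
    """Label 1 if phenoconversion occurs within `horizon` years of the sample.

    Assumption: None means no phenoconversion was observed; how the study
    defined event labels per timeframe is not stated in the caption.
    """
    return [1 if t is not None and t <= horizon else 0 for t in years_to_event]


# Hypothetical samples: years from sampling to phenoconversion, and a
# model risk score per sample (e.g. a predicted probability).
years_to_event = [0.4, 1.5, 4.0, None, None, 0.8]
scores = [0.9, 0.3, 0.6, 0.1, 0.2, 0.7]

auc_by_horizon = {
    h: roc_auc(labels_for_horizon(years_to_event, h), scores)
    for h in (0.5, 1.0, 2.0, 3.0, 5.0)
}
```

The same scores can yield different AUC values at different horizons because the set of samples counted as events changes with the horizon.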
Figure 4. Time-to-phenoconversion estimation
Scatterplots illustrating the relationship between observed and estimated (or predicted) time-to-phenoconversion for (A) the 19-protein panel, (B) the 15-protein panel (subset of 19 proteins that is also available in UK Biobank data), and (C) NEFL alone. Red dots indicate overestimation and blue dots indicate underestimation. RMSE = root mean square error (average error between predicted and observed); MAE = mean absolute error (average absolute difference between predicted and observed); Cor = Pearson correlation coefficient.
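The three summary statistics defined in this caption can be computed as in the minimal sketch below. The function name and the example values are hypothetical and are not data from the study; errors are taken as predicted minus observed, so a positive error corresponds to overestimation.

```python
import math


def time_to_event_metrics(observed, predicted):
    """Return RMSE, MAE and Pearson correlation for paired values (in years)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    # Positive error = overestimation, negative error = underestimation.
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined when either input is constant")
    cor = cov / math.sqrt(var_o * var_p)
    return {"rmse": rmse, "mae": mae, "cor": cor}


# Hypothetical observed and predicted times-to-phenoconversion, in years.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 4.5]
metrics = time_to_event_metrics(observed, predicted)
```

RMSE penalizes large errors more heavily than MAE, so the two diverge when a few predictions are far off; here every error has the same magnitude, so both equal 0.5 years.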
Figure 5. UK Biobank replication
(A) Scatterplot showing high correlation between discovery and UKB replication cohorts, for proteins that were differentially (up and down) regulated in clinically manifest ALS vs. healthy controls. (B) Pseudo-longitudinal (constructed using cross-sectional data) patterns of 7 proteins with relative abundance that deviates significantly from healthy controls. This heatmap included data from phenoconverters and those with clinically manifest ALS. The number in each cell represents the average relative protein abundance at the indicated time interval, with negative and positive time being, respectively, before and after phenoconversion. Empty cells denote time intervals where differences were not statistically significant. The color scale reflects the direction and magnitude of change, with warmer colors (red) indicating increased concentrations relative to controls and cooler colors (blue) indicating decreased concentration relative to controls. Proteins are grouped by functional modules (see Figure 1D), as indicated by the color bar in the first column. (C) Visual depiction of pseudo-longitudinal trajectories of two proteins of interest (NEFL and EDA2R) relative to sex- and age-matched controls. Horizontal red bars indicate the time periods during which the relative protein abundance is significantly elevated (i.e. log2FC > 0). The grey shaded areas represent the 95% confidence interval for the relative protein abundance. (D) Scatterplots illustrating the relationship between observed and estimated time-to-phenoconversion for the 15-protein panel and for NEFL alone in UKB data.
