This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2023 Sep 1:2023.06.19.545638.

doi: 10.1101/2023.06.19.545638.

Data-driven biomarkers outperform theory-based biomarkers in predicting stroke motor outcomes

Emily R Olafson¹, Christoph Sperber², Keith W Jamison¹, Mark D Bowren Jr³, Aaron D Boes⁴, Justin W Andrushko⁵, Michael R Borich⁶, Lara A Boyd⁵,⁷, Jessica M Cassidy⁸, Adriana B Conforto⁹,¹⁰, Steven C Cramer¹¹, Adrienne N Dula¹², Fatemeh Geranmayeh¹³, Brenton Hordacre¹⁴, Neda Jahanshad¹⁵, Steven A Kautz¹⁶,¹⁷, Bethany Lo¹⁸, Bradley J MacIntosh¹⁹,²⁰, Fabrizio Piras²¹, Andrew D Robertson¹⁹,²², Na Jin Seo¹⁶,¹⁷,²³, Surjo R Soekadar²⁴, Sophia I Thomopoulos¹⁵, Daniela Vecchio²¹, Timothy B Weng¹²,²⁵, Lars T Westlye²⁶,²⁷, Carolee J Winstein²⁸,²⁹, George F Wittenberg³⁰,³¹, Kristin A Wong³², Paul M Thompson¹⁵, Sook-Lei Liew³³, Amy F Kuceyeski¹

Affiliations

¹ Department of Radiology, Weill Cornell Medicine, New York City, New York, USA.
² Department of Neurology, Inselspital, University Hospital Bern, University of Bern, Bern, Switzerland.
³ Department of Neurology, Carver College of Medicine, Iowa City, IA, USA.
⁴ Departments of Neurology, Psychiatry, and Pediatrics, Carver College of Medicine, Iowa City, IA, USA.
⁵ Department of Physical Therapy, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada.
⁶ Division of Physical Therapy, Department of Rehabilitation Medicine, Emory University School of Medicine, Atlanta, GA, USA.
⁷ Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, Canada.
⁸ Department of Health Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
⁹ Hospital das Clinicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, São Paulo, Brazil.
¹⁰ Hospital Israelita Albert Einstein, São Paulo, Brazil.
¹¹ Dept. Neurology, UCLA; California Rehabilitation Institute, Los Angeles, CA, USA.
¹² Department of Neurology, Dell Medical School at The University of Texas Austin, Austin, TX, USA.
¹³ Clinical Language and Cognition Group, Department of Brain Sciences, Imperial College London, London, United Kingdom.
¹⁴ Innovation, Implementation and Clinical Translation (IIMPACT) in Health, Allied Health and Human Performance, University of South Australia, Adelaide, Australia.
¹⁵ Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Charleston, SC, USA.
¹⁶ Department of Health Sciences & Research, Medical University of South Carolina, Charleston, SC, USA.
¹⁷ Ralph H Johnson VA Health Care System, Charleston, SC, USA.
¹⁸ Chan Division of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, USA.
¹⁹ Sandra Black Centre for Brain Resilience and Recovery, Hurvitz Brain Sciences Program, Sunnybrook Research Institute, Toronto, ON, Canada.
²⁰ Computational Radiology and Artificial Intelligence (CRAI), Department of Physics and Computational Radiology, Clinic for Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway.
²¹ Laboratory of Neuropsychiatry, Santa Lucia Foundation IRCCS, Rome, Italy.
²² Schlegel-UW Research Institute for Aging, Waterloo, ON, Canada.
²³ Department of Rehabilitation Sciences, Medical University of South Carolina, Charleston, SC, USA.
²⁴ Dept. of Psychiatry and Neurosciences, Charité Campus Mitte (CCM), Charité - Universitätsmedizin Berlin, Berlin, Germany.
²⁵ Department of Diagnostic Medicine, Dell Medical School, The University of Texas at Austin, Austin, TX, USA.
²⁶ Department of Psychology, University of Oslo, Oslo, Norway.
²⁷ NORMENT, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway.
²⁸ Division of Biokinesiology and Physical Therapy, Herman Ostrow School of Dentistry, University of Southern California, Los Angeles, CA, USA.
²⁹ Department of Neurology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
³⁰ Departments of Neurology, Bioengineering, Physical Medicine & Rehabilitation, University of Pittsburgh, Pittsburgh, PA, USA.
³¹ GRECC, HERL, Department of Veterans Affairs Pittsburgh Healthcare System, Pittsburgh, PA, USA.
³² Department of Physical Medicine & Rehabilitation, Dell Medical School, University of Texas at Austin, Austin, TX, USA.
³³ Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA.

PMID: 37693419
PMCID: PMC10491132
DOI: 10.1101/2023.06.19.545638

Emily R Olafson et al. bioRxiv. 2023.

Update in

Data-driven biomarkers better associate with stroke motor outcomes than theory-based biomarkers.
Olafson ER, Sperber C, Jamison KW, Bowren MD Jr, Boes AD, Andrushko JW, Borich MR, Boyd LA, Cassidy JM, Conforto AB, Cramer SC, Dula AN, Geranmayeh F, Hordacre B, Jahanshad N, Kautz SA, Tavenner BP, MacIntosh BJ, Piras F, Robertson AD, Seo NJ, Soekadar SR, Thomopoulos SI, Vecchio D, Weng TB, Westlye LT, Winstein CJ, Wittenberg GF, Wong KA, Thompson PM, Liew SL, Kuceyeski AF. Brain Commun. 2024 Jul 31;6(4):fcae254. doi: 10.1093/braincomms/fcae254. eCollection 2024. PMID: 39171205.

Abstract

Chronic motor impairments are a leading cause of disability after stroke. Previous studies have predicted motor outcomes based on the degree of damage to predefined structures in the motor system, such as the corticospinal tract. However, such theory-based approaches may not take full advantage of the information contained in clinical imaging data. The present study uses data-driven approaches to predict chronic motor outcomes after stroke and compares the accuracy of these predictions to that of previously identified theory-based biomarkers. Using a cross-validation framework, regression models were trained using lesion masks and motor outcomes data from 789 stroke patients (293 female/496 male) from the ENIGMA Stroke Recovery Working Group (age 64.9±18.0 years; time since stroke 12.2±0.2 months; normalised motor score 0.7±0.5, range [0,1]). The out-of-sample prediction accuracy of two theory-based biomarkers was assessed: lesion load of the corticospinal tract, and lesion load of multiple descending motor tracts. These theory-based prediction accuracies were compared to the prediction accuracy of three data-driven biomarkers: lesion load of lesion-behaviour maps, lesion load of structural networks associated with lesion-behaviour maps, and measures of regional structural disconnection. In general, data-driven biomarkers had better prediction accuracy, measured as higher explained variance in chronic motor outcomes, than theory-based biomarkers. Data-driven models of regional structural disconnection performed the best of all models tested (R² = 0.210, p < 0.001), performing significantly better than predictions using the theory-based biomarkers of lesion load of the corticospinal tract (R² = 0.132, p < 0.001) and of multiple descending motor tracts (R² = 0.180, p < 0.001). They also performed slightly, but significantly, better than the other data-driven biomarkers: lesion load of lesion-behaviour maps (R² = 0.200, p < 0.001) and lesion load of structural networks associated with lesion-behaviour maps (R² = 0.167, p < 0.001). Ensemble models that added basic demographic variables, such as age, sex, and time since stroke, improved prediction accuracy for both theory-based and data-driven biomarkers. Finally, combining both theory-based and data-driven biomarkers with demographic variables improved predictions further, and the best ensemble model achieved R² = 0.241, p < 0.001. Overall, these results demonstrate that models that predict chronic motor outcomes using data-driven features, particularly when lesion data are represented in terms of structural disconnection, perform better than models that predict chronic motor outcomes using theory-based features from the motor system. However, combining both theory-based and data-driven models provides the best predictions.
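As an illustration of the simplest biomarker family named in the abstract, lesion load can be sketched as the overlap between a binary lesion mask and a tract (or map) image on the same voxel grid. The abstract does not state the exact definition used in the study, so the version below (the weighted fraction of the tract covered by the lesion) is an assumption, and the function and array names are hypothetical.

```python
import numpy as np

def lesion_load(lesion_mask, tract_map):
    """Weighted fraction of a tract covered by a lesion (illustrative definition)."""
    lesion = np.asarray(lesion_mask, dtype=bool)
    tract = np.asarray(tract_map, dtype=float)
    if lesion.shape != tract.shape:
        raise ValueError("lesion mask and tract map must share a voxel grid")
    total = tract.sum()
    if total == 0:
        return 0.0  # empty tract map: define the load as zero
    # Sum the tract weights inside the lesion, normalised by the whole tract.
    return float(tract[lesion].sum() / total)

# Toy 1-D "images": a tract spanning 4 voxels and a lesion covering 2 of them.
tract = np.array([0, 1, 1, 1, 1, 0])
lesion = np.array([0, 0, 1, 1, 0, 0])
load = lesion_load(lesion, tract)  # 2 of 4 tract voxels are lesioned
```

With a probabilistic tract or a weighted lesion-behaviour map in place of the binary `tract`, the same function would return a weighted overlap rather than a voxel count ratio.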

Keywords: imaging biomarkers; lesion-deficit inference; machine learning.

Conflict of interest statement

Competing Interests: S.C.C. serves as a consultant for Abbvie, Constant Therapeutics, BrainQ, Myomo, MicroTransponder, Neurolutions, Panaxium, NeuExcell, Elevian, Helius, Omniscient, Brainsgate, Nervgen, Battelle, and TRCare. B.H. has a clinical partnership with Fourier Intelligence. N.J.S. is an inventor on patent US 10,071,015 B2. C.J.W. is a consultant for Microtransponder, BrainQ, and MedRhythm. G.F.W. sits on Advisory Boards for Myomo and Neuro-innovators.

Figures

**Figure 1. Cross-validation framework.**
A. Overview of 5-fold cross-validation. Subject data is partitioned into five non-overlapping training and test folds, such that no training subjects are in the test set, and no subject is in the test fold more than once. B. Use of acute/subacute subjects in training folds but not test folds. When using all training data, chronic subjects were included in the test folds and training folds, whereas acute/subacute stroke subjects were only included in training folds.
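The fold construction described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `make_folds` and `is_chronic` are hypothetical names, only the outer folds are shown (any inner loop for hyperparameter tuning is omitted), and the random shuffling scheme is a guess.

```python
import numpy as np

def make_folds(is_chronic, n_folds=5, seed=0):
    """Return (train_idx, test_idx) pairs whose test folds hold chronic subjects only."""
    is_chronic = np.asarray(is_chronic, dtype=bool)
    rng = np.random.default_rng(seed)
    # Shuffle chronic subjects and split them into non-overlapping test folds.
    chronic_idx = rng.permutation(np.flatnonzero(is_chronic))
    acute_idx = np.flatnonzero(~is_chronic)
    test_folds = np.array_split(chronic_idx, n_folds)
    folds = []
    for k in range(n_folds):
        test_idx = test_folds[k]
        # Training data: chronic subjects from the other folds ...
        train_chronic = np.concatenate(
            [test_folds[j] for j in range(n_folds) if j != k]
        )
        # ... plus every acute/subacute subject, who never appears in a test fold.
        train_idx = np.concatenate([train_chronic, acute_idx])
        folds.append((np.sort(train_idx), np.sort(test_idx)))
    return folds

# Toy cohort: 20 chronic subjects followed by 7 acute/subacute subjects.
is_chronic = np.array([True] * 20 + [False] * 7)
folds = make_folds(is_chronic)
```

Each chronic subject lands in exactly one test fold, and dropping `acute_idx` from `train_idx` would give the chronic-only training variant.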

**Figure 2. Theory-based biomarkers.**
A. The M1-CST, here displaying only right hemisphere tracts relative to an MNI template. B. Tracts from the sensorimotor tract template atlas (SMATT), displaying only right hemisphere tracts relative to an MNI template, including pre-supplementary motor area (pre-SMA), supplementary motor area (SMA), dorsal premotor cortex (PMd), ventral premotor cortex (PMv), primary motor cortex (M1), and primary sensory cortex (S1). Pre-SMA is the most anterior tract, S1 is the most posterior tract.

**Figure 3. Data-driven biomarkers.**
A. Lesion-behaviour map (LBM) representing the association between voxelwise damage and Fugl-Meyer scores, derived from multivariate lesion-behaviour mapping with Fugl-Meyer scores. B. Structural lesion network maps (sLNMs), derived from seed-based tractography run on peak regions identified from LBM (A) and then performing principal components analysis to identify 3 components, split into positive and negative weights. C. Change in Connectivity (ChaCo) scores derived from the Network Modification (NeMo) tool. Binary lesion masks in MNI space representing the presence of a stroke lesion (turquoise) in a given voxel are provided by the user. Each lesion mask is embedded into 420 unrelated healthy structural connectomes (separately for each healthy subject) and the regional ChaCo scores are calculated and averaged across healthy subjects (parcellation shown here is the Shen 268-region atlas).

**Figure 4. Summary of model performance metrics across all models tested and feature weights (regression coefficients β) for the two best-performing models.**
A. and B. Distribution of model performance (mean Pearson correlation/R² across 5 outer folds for 100 permutations of the data). Asterisks (*) indicate that model performance is significantly above chance (*, p < 0.001), as assessed via permutation testing. The boxes extend from the lower to upper quartile values of the data, with a line at the median. Whiskers represent the range of the data from [Q1–1.5*IQR, Q3+1.5*IQR]. C. and D. Mean feature weights for the top two best-performing models (ChaCo (fs86) without feature selection, ChaCo (shen268) with feature selection, respectively). For the fs86-ChaCo model (left), we display the mean regression coefficients β across 100 permutations. For ChaCo (shen268) (right), we display the median regression coefficients of regions that were selected in at least 95% of outer folds (i.e., for regions that were included in the model in at least 475/500 outer folds, mean β coefficients were calculated across 5 outer folds, and the median value across 100 permutations is plotted).
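The aggregation of shen268 feature weights described in this caption (keep regions selected in at least 475/500 outer folds, average β across the 5 outer folds, then take the median across 100 permutations) can be sketched as below. The function name, the array layout, and the convention that an unselected region has a coefficient of exactly zero are assumptions made for illustration.

```python
import numpy as np

def stable_feature_weights(betas, min_fraction=0.95):
    """Aggregate coefficients of shape (n_permutations, n_folds, n_regions)."""
    betas = np.asarray(betas, dtype=float)
    n_regions = betas.shape[-1]
    # Assumption: a region dropped by feature selection has beta == 0.
    selected = betas != 0
    selection_rate = selected.reshape(-1, n_regions).mean(axis=0)
    keep = selection_rate >= min_fraction
    # Mean across outer folds, then median across permutations.
    mean_over_folds = betas.mean(axis=1)
    median_over_perms = np.median(mean_over_folds, axis=0)
    # Regions below the selection threshold are reported as NaN.
    return np.where(keep, median_over_perms, np.nan), selection_rate

# Toy data: 100 permutations x 5 outer folds x 3 regions.
betas = np.zeros((100, 5, 3))
flat = betas.reshape(-1, 3)   # view onto the same memory, 500 rows
flat[:, 0] = -1.0             # region 0: selected in all 500 folds
flat[:480, 1] = 0.5           # region 1: selected in 480/500 folds
flat[:250, 2] = 0.2           # region 2: selected in 250/500 folds
weights, rate = stable_feature_weights(betas)
```

In this toy case regions 0 and 1 pass the 95% threshold and region 2 does not, so only the first two receive a plotted weight.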

**Figure 5. Statistical comparison of model performance for predicting motor scores using Mann-Whitney signed-rank tests.**
Colours shown indicate the differences in median explained variance scores for each model. A. Models trained using all (acute and chronic) training data. B. Models trained only using chronic data. *** denotes corrected p < 0.001 after Bonferroni correction. A positive difference indicates that the model on the y-axis (vertical) has a greater explained variance than the model on the x-axis (horizontal).

**Figure 6. Statistical comparison of model performance for ensemble models.**
Demog. = demographic information (age, sex, days since stroke). ChaCo = model using 268-region ChaCo scores with feature selection. Significance of differences in explained variance was evaluated using Mann-Whitney signed-rank tests; *** denotes corrected p < 0.001 after Bonferroni correction. A positive difference value indicates that the model on the y-axis (vertical) has a greater explained variance than the model on the x-axis (horizontal).

**Figure 7. Analysis of feature stability for 268-region ChaCo models (with feature selection) and investigation of paradoxical feature weights.**
A. Correlation between beta coefficients across five training folds for one permutation. Each point corresponds to one region, and points are coloured by the mean beta coefficient for that region across 500 training folds (i.e. coloured based on y-axis value). B. Boxplots show the distribution of beta coefficients of consistently-weighted regions (defined as having median beta coefficients that are zero or of an opposite sign <5% of the time). In total, 30 regions with consistent negative weights and 5 regions with consistent positive weights remained. Median weights for consistently-weighted regions are plotted on a brain. The boxes extend from the lower to upper quartile values of the data, with a line at the median. Whiskers represent the range of the data from [Q1–1.5*IQR, Q3+1.5*IQR].
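One possible reading of the "consistently-weighted" criterion in panel B is sketched below: a region counts as consistent when its coefficient is zero or opposite in sign to its median in fewer than 5% of samples. The caption wording is ambiguous, so this interpretation, the function name, and the (samples × regions) layout are all assumptions.

```python
import numpy as np

def consistent_regions(betas, max_inconsistent=0.05):
    """Flag regions whose coefficient sign rarely departs from its median sign."""
    betas = np.asarray(betas, dtype=float)  # shape: (n_samples, n_regions)
    median_sign = np.sign(np.median(betas, axis=0))
    # A sample is inconsistent if its sign differs from the median sign;
    # a zero coefficient has sign 0 and therefore also counts as inconsistent.
    inconsistent = (np.sign(betas) != median_sign).mean(axis=0)
    consistent = (median_sign != 0) & (inconsistent < max_inconsistent)
    return consistent, median_sign

# Toy data: 100 samples x 3 regions.
betas = np.zeros((100, 3))
betas[:, 0] = -1.0    # always negative
betas[:96, 1] = 0.8   # positive in 96 samples ...
betas[96:, 1] = -0.1  # ... negative in 4 (4% inconsistent)
betas[:60, 2] = 0.3   # positive in 60 samples, zero in 40
consistent, median_sign = consistent_regions(betas)
```

Here the first two regions are flagged as consistent (negative and positive, respectively), while the third is excluded because 40% of its coefficients are zero.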
