J Imaging Inform Med. 2024 Dec;37(6):3231-3249.
doi: 10.1007/s10278-024-01031-y. Epub 2024 Jun 13.

Time-Dependent Deep Learning Prediction of Multiple Sclerosis Disability

John D Mayfield et al. J Imaging Inform Med. 2024 Dec.

Abstract

Most deep learning models in medical image analysis address a single timepoint, such as identifying current pathology on a given image or volume. This contrasts with diagnostic practice in radiology, where presumed pathologic findings are correlated with prior studies and with subsequent changes over time. For multiple sclerosis (MS), the current literature describes various forms of lesion segmentation, with few studies analyzing disability progression over time. For longitudinal, time-dependent analysis, we propose a combinatorial analysis of a video vision transformer (ViViT) benchmarked against traditional recurrent Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architectures and a hybrid Vision Transformer-LSTM (ViT-LSTM) to predict long-term disability based on the Expanded Disability Status Scale (EDSS). The patient cohort was drawn from a two-site institution and comprised multisequence, contrast-enhanced MRIs of the cervical spine from 703 patients, acquired between 2002 and 2023. Following a competitive performance analysis, a VGG-16-based CNN-LSTM was compared with ViViT, with an ablation analysis to determine the time dependency of the models. The VGG16-LSTM predicted trinary classification of EDSS in 6 years with 0.74 AUC versus 0.84 AUC for the ViViT (p < 0.001 per 5 × 2 cross-validation F-test) on an 80:20 hold-out testing split. However, the VGG16-LSTM outperformed ViViT when the analysis was restricted to patients with only 2 years of MRIs (n = 94) (0.75 AUC versus 0.72 AUC, respectively). Exact EDSS prediction was investigated for both models using both classification and regression strategies, but both performed worse overall. Our experimental results demonstrate the ability of time-dependent deep learning models to predict disability in MS using trinary stratification of disability, mimicking clinical practice. Further work includes external validation and subsequent observational clinical trials.
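The 5 × 2 cross-validation F-test cited in the abstract can be illustrated with a minimal sketch. It assumes the combined F statistic in its commonly described form (five replications of 2-fold cross-validation, statistic compared against an F distribution with 10 and 5 degrees of freedom). The function name, the choice of AUC differences as input, and the example numbers are all hypothetical and are not taken from the study.

```python
def combined_5x2cv_f(diffs):
    """Combined 5x2 cv F statistic.

    diffs: five (fold_1, fold_2) pairs, each value being the difference in a
    performance score between two models on one fold of one replication.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications of 2-fold differences")

    # Numerator: sum of all ten squared differences.
    numerator = sum(d * d for pair in diffs for d in pair)

    # Denominator: twice the sum of the per-replication variance estimates.
    variance_sum = 0.0
    for d1, d2 in diffs:
        mean = (d1 + d2) / 2.0
        variance_sum += (d1 - mean) ** 2 + (d2 - mean) ** 2
    if variance_sum == 0.0:
        raise ValueError("zero variance across folds; statistic undefined")

    return numerator / (2.0 * variance_sum)


# Approximate upper 5% point of F(10, 5); an exact p-value would need an
# F-distribution routine, which this stdlib-only sketch omits.
F_CRITICAL_10_5_ALPHA_05 = 4.74

# Hypothetical AUC differences (model A minus model B), NOT study data.
example_diffs = [(0.10, 0.08), (0.12, 0.09), (0.07, 0.11),
                 (0.09, 0.10), (0.11, 0.08)]
f_stat = combined_5x2cv_f(example_diffs)
significant = f_stat > F_CRITICAL_10_5_ALPHA_05
```

A statistic above the critical value would suggest that the two models differ at the 5% level; the p < 0.001 reported in the abstract corresponds to a stricter threshold that this sketch does not compute.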

Keywords: Artificial intelligence; Medical imaging; Multiple sclerosis; Time-dependent deep learning; Video transformers.

Conflict of interest statement

Declarations. Ethics Approval: This study was retrospective and did not involve changes to treatment plans, additional radiation doses through increased diagnostic examinations, or release of patient information. The USF IRB committee approved a waiver and confirmed no ethical approval would be required. Consent to Participate: Informed consent was waived per IRB as this was a retrospective study. Upon receiving any scan within the facility network, patients sign a form stating that their deidentified exams may be used for future research and teaching purposes. Consent for Publication: The authors affirm that informed consent was waived per the IRB for publication of the images in Fig. 4. Competing Interests: The authors declare no competing interests.

Figures

Fig. 1
Flow chart of cohort selection
Fig. 2
Schematic of cord segmentation and preprocessing to a sequential, multichannel input as a six-frame "video." Initial conversion of DICOM images to NIfTI in MATLAB®, followed by straightening with Spinal Cord Toolbox (https://github.com/spinalcordtoolbox/spinalcordtoolbox), yields a three-channel image containing the T1-weighted, T1 + contrast, and T2-weighted images (top). This is repeated for each year’s MRI for the patient, and the resulting images are placed in sequential order (bottom)
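The channel stacking and temporal ordering described in this caption might be sketched as follows. This is only an illustration under stated assumptions: each sequence is taken to be an already straightened 2D slice held as a nested list, and the names `stack_channels`, `build_video`, and the dictionary keys are hypothetical. The conversion and straightening steps themselves (MATLAB®, Spinal Cord Toolbox) are not reproduced here, and how patients with fewer than six yearly studies were handled is not stated in the source, so the sketch does no padding.

```python
def stack_channels(t1, t1_contrast, t2):
    """Merge three equally sized 2D slices into one H x W x 3 frame."""
    if not (len(t1) == len(t1_contrast) == len(t2)):
        raise ValueError("sequences must have the same number of rows")
    frame = []
    for row_t1, row_c, row_t2 in zip(t1, t1_contrast, t2):
        if not (len(row_t1) == len(row_c) == len(row_t2)):
            raise ValueError("sequences must have the same number of columns")
        # Channel order (assumed): T1-weighted, T1 + contrast, T2-weighted.
        frame.append([[a, b, c] for a, b, c in zip(row_t1, row_c, row_t2)])
    return frame


def build_video(yearly_scans):
    """Order one patient's yearly scans chronologically and stack each year."""
    ordered = sorted(yearly_scans, key=lambda scan: scan["year"])
    return [stack_channels(s["t1"], s["t1_contrast"], s["t2"]) for s in ordered]


def constant_slice(value, height=2, width=2):
    """Toy stand-in for a preprocessed slice (hypothetical data)."""
    return [[value] * width for _ in range(height)]


# Two toy years supplied out of order; build_video sorts them by year.
demo_scans = [
    {"year": 2021, "t1": constant_slice(4), "t1_contrast": constant_slice(5),
     "t2": constant_slice(6)},
    {"year": 2020, "t1": constant_slice(1), "t1_contrast": constant_slice(2),
     "t2": constant_slice(3)},
]
demo_video = build_video(demo_scans)  # frames x height x width x 3
```

In practice an array library would likely replace the nested lists; plain lists are used here only to keep the sketch dependency-free.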
Fig. 3
Competitive analysis of convolutional-based architectures versus transformer-based models. Across five repeated random samplings (RRS), VGG16-LSTM and ViViT consistently had the highest validation and hold-out testing AUC on the entire dataset
Fig. 4
Examples of misclassification with occlusion maps and classification probability
Fig. 5
Confusion matrices for trinary classification of hold-out testing on patients with only 2 sequential MRIs
Fig. 6
Occlusion maps for trinary EDSS prediction between models. A small patch of zeros is passed across the data and logit change is measured. Brighter pixels equate to more important features to the models. Predicted and Truth class labels are provided for each example. VGG = VGG16-LSTM, SVIT = ViViT. 0 = mild, 1 = moderate, 2 = severe disability
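The occlusion procedure named in this caption (a zero patch swept across the input while the logit change is recorded) can be sketched on a toy example. The sketch assumes a single-channel 2D image and a scalar logit function, whereas the study's inputs are multichannel image sequences. The patch size, the stride, the averaging over overlapping patches, the sign convention (baseline minus occluded logit), and the names `occlusion_map` and `toy_logit` are illustrative assumptions, not the authors' implementation.

```python
def occlusion_map(image, logit_fn, patch=2, stride=1):
    """Average drop in logit when each pixel is covered by a zero patch."""
    height, width = len(image), len(image[0])
    baseline = logit_fn(image)
    total_drop = [[0.0] * width for _ in range(height)]
    hits = [[0] * width for _ in range(height)]

    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            # Copy the image and zero out one patch.
            occluded = [row[:] for row in image]
            for r in range(top, top + patch):
                for c in range(left, left + patch):
                    occluded[r][c] = 0.0
            drop = baseline - logit_fn(occluded)
            # Credit the drop to every pixel the patch covered.
            for r in range(top, top + patch):
                for c in range(left, left + patch):
                    total_drop[r][c] += drop
                    hits[r][c] += 1

    return [
        [total_drop[r][c] / hits[r][c] if hits[r][c] else 0.0
         for c in range(width)]
        for r in range(height)
    ]


def toy_logit(image):
    """Hypothetical 'model' that only looks at two central pixels."""
    return image[1][1] + image[1][2]


toy_image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(toy_image, toy_logit, patch=2, stride=1)
```

In this toy case, pixels that the logit depends on receive larger values than pixels in the bottom row, which the logit ignores; this corresponds to the brighter regions described in the caption.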
Fig. 7
Confusion matrices for hold-out testing on exact EDSS predictions (scale 0–10) for (a) classification and (b) regression architectures for the VGG16-LSTM (left) and ViViT (right)
Fig. 8
Occlusion maps for exact EDSS prediction between models in classification. A small patch of zeros is passed across the data and logit change is measured. Brighter pixels equate to more important features to the models. Predicted and Truth class labels are provided for each example. VGG = VGG16-LSTM, SVIT = ViViT
