NPJ Digit Med. 2025 Jul 24;8(1):478. doi: 10.1038/s41746-025-01788-8.

Personalized federated learning for predicting disability progression in multiple sclerosis using real-world routine clinical data

Ashkan Pirmani  1   2   3   4 Edward De Brouwer  1 Ádám Arany  1 Martijn Oldenhof  1 Antoine Passemiers  1 Axel Faes  2   3   4 Tomas Kalincik  5 Serkan Ozakbas  6 Riadh Gouider  7 Barbara Willekens  8   9 Dana Horakova  10 Eva Kubala Havrdova  10 Francesco Patti  11 Alexandre Prat  12 Alessandra Lugaresi  13 Valentina Tomassini  14 Pierre Grammond  15 Elisabetta Cartechini  16 Izanne Roos  5 Cavit Boz  17 Raed Alroughani  18 Maria Pia Amato  19   20 Katherine Buzzard  21 Jeannette Lechner-Scott  22 Joana Guimarães  23   24 Claudio Solaro  25 Oliver Gerlach  26 Aysun Soysal  27 Jens Kuhle  28 Jose Luis Sanchez-Menoyo  29 Daniele Spitaleri  30 Tunde Csepany  31 Bart Van Wijmeersch  32 Radek Ampapa  33 Julie Prevost  34 Samia J Khoury  35 Vincent Van Pesch  36 Nevin John  37 Davide Maimone  38 Bianca Weinstock-Guttman  39 Guy Laureys  40 Pamela McCombe  41 Yolanda Blanco  42 Ayse Altintas  43 Abdullah Al-Asmi  44 Justin Garber  45 Anneke Van der Walt  46 Helmut Butzkueven  46 Koen de Gans  47 Csilla Rozsa  48 Bruce Taylor  49 Talal Al-Harbi  50 Attila Sas  51 Cecilia Rajda  52 Orla Gray  53 Danny Decoo  54 William M Carroll  55 Allan G Kermode  56 Marzena Fabis-Pedrini  57 Deborah Mason  58 Angel Perez-Sempere  59 Mihaela Simu  60 Neil Shuey  61 Bhim Singhal  62 Marija Cauchi  63 Todd A Hardy  64 Sudarshini Ramanathan  65 Patrice Lalive  66 Carmen-Adella Sirbu  67 Stella Hughes  68 Tamara Castillo Trivino  69 Liesbet M Peeters  2   3   4 Yves Moreau  70


Abstract

Early prediction of disability progression in multiple sclerosis (MS) remains challenging despite its critical importance for therapeutic decision-making. We present the first systematic evaluation of personalized federated learning (PFL) for 2-year MS disability progression prediction, leveraging multi-center real-world data from over 26,000 patients. While conventional federated learning (FL) enables privacy-aware collaborative modeling, it remains vulnerable to institutional data heterogeneity. PFL overcomes this challenge by adapting shared models to local data distributions without compromising privacy. We evaluated two personalization strategies: a novel AdaptiveDualBranchNet architecture with selective parameter sharing, and personalized fine-tuning of global models, benchmarked against centralized and client-specific approaches. Baseline FL underperformed, whereas personalization significantly improved performance: personalized FedProx and FedAvg achieved ROC-AUC scores of 0.8398 ± 0.0019 and 0.8384 ± 0.0014, respectively. These findings establish personalization as critical for scalable, privacy-aware clinical prediction models and highlight its potential to inform earlier intervention strategies in MS and beyond.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. Analysis paradigms for predicting disability progression in multiple sclerosis using real-world data.
a Baseline Federated Learning (FL): Depicts the classic iterative process in which multiple clients (e.g., clinical sites) collaboratively train a single global model. Each client receives the current global model (Step 1) and trains it locally on its private dataset (Step 2). The locally updated parameters are then uploaded to a central server (Step 3) and aggregated (Step 4), refining the global model without exchanging any raw patient data. The updated global model is then disseminated to all clients (Step 5) for continued local training based on the aggregated knowledge. b Personalized Federated Learning (PFL) with Adaptive Partial Parameter Exchanges (AdaptiveDualBranchNet): Illustrates a dual-branch architecture in which each client’s model is split into a shared core (federated across all clients) and local extension layers (trained solely on private data). During each federated round (Steps 1, 3, 5), only the shared core parameters are exchanged and aggregated at the central server, preserving common knowledge. The local extension layers remain entirely on-site (Step 2), allowing each client to further personalize its model based on its unique data distribution and sample size. c PFL via Fine-Tuning: Shows how a pre-trained global FL model (Step 1) is shared with each client. Each client fine-tunes this model on its local dataset (Step 2), creating a personalized version (Step 3) that reflects client-specific characteristics. This approach retains the benefits of cross-site collaboration while allowing for tailored predictions. Collectively, these paradigms form part of a broader analysis that also includes centralized (pooled data) and local (client-specific) training baselines. This holistic evaluation framework helps elucidate the strengths and trade-offs of each approach in leveraging real-world data for predicting disability progression in multiple sclerosis.
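A minimal Python sketch of one FedAvg round (panel a) followed by per-client fine-tuning (panel c) is given below. It is illustrative only: the toy model, the fedavg and local_update helpers, and all sizes and hyperparameters are placeholders, not the study's actual implementation.

import copy
import torch
import torch.nn as nn

def make_model(n_features: int = 10) -> nn.Module:
    # Toy binary classifier standing in for the study's tabular progression model.
    return nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, 1))

def fedavg(state_dicts, weights):
    # Step 4: server-side weighted average of the uploaded client parameters.
    avg = copy.deepcopy(state_dicts[0])
    for name in avg:
        avg[name] = sum(w * sd[name] for sd, w in zip(state_dicts, weights))
    return avg

def local_update(model, x, y, steps: int = 5, lr: float = 1e-2):
    # Step 2: train the received model on the client's private data only.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x).squeeze(-1), y).backward()
        opt.step()
    return model

global_model = make_model()
client_data = [(torch.randn(64, 10), torch.randint(0, 2, (64,)).float()) for _ in range(3)]
sizes = torch.tensor([float(len(x)) for x, _ in client_data])
weights = (sizes / sizes.sum()).tolist()

# Steps 1-3: broadcast the global model, train locally, upload the updates.
updates = [local_update(copy.deepcopy(global_model), x, y).state_dict() for x, y in client_data]
# Steps 4-5: aggregate on the server and redistribute the refined global model.
global_model.load_state_dict(fedavg(updates, weights))
# Panel c: each client fine-tunes the shared global model on its own data.
personalized = [local_update(copy.deepcopy(global_model), x, y) for x, y in client_data]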
Fig. 2
Fig. 2. Comparative analysis of federated learning model performance.
Evaluating ROC–AUC and AUC–PR metrics across different strategies. (Left) Receiver operating characteristic curve: The centralized model achieves the highest performance, with ROC–AUC = 0.8092 ± 0.0012, demonstrating the advantage of a pooled dataset. Among FL strategies, FedAdam and FedYogi perform best, with ROC–AUC values of 0.7920 ± 0.0031 and 0.7910 ± 0.0028, respectively. The other FL methods, including FedAvg and FedProx, show slightly lower performance, underscoring the challenges of a global federated model in heterogeneous data settings. (Right) Precision–recall curve: Again, the centralized model leads, with AUC–PR = 0.4605 ± 0.0043. Among FL methods, FedAdam achieves the highest AUC–PR of 0.4488 ± 0.0061, while FedYogi and FedProx follow with values of 0.4420 ± 0.0078 and 0.4081 ± 0.0058, respectively. The drop in performance relative to the centralized approach reflects the difficulty of capturing minority-class predictions in federated settings. These results emphasize the performance gap between centralized and federated learning strategies, particularly in heterogeneous and imbalanced data scenarios.
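For reference, metrics of this kind are commonly computed with scikit-learn. The short sketch below uses synthetic labels and scores to show one conventional way of obtaining ROC–AUC and an AUC–PR summary; it is not the paper's evaluation pipeline.

import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                               # binary 2-year progression label
y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 1000), 0, 1)   # toy model scores

print("ROC-AUC:", roc_auc_score(y_true, y_score))
print("AUC-PR :", average_precision_score(y_true, y_score))          # average precision as the PR-curve summary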
Fig. 3
Fig. 3. Comparison of ROC–AUC and AUC–PR differences among adaptive, fine-tuned, and federated models.
This figure shows average ROC–AUC and AUC–PR score differences across five strategies (FedAvg, FedProx, FedAdam, FedYogi, and FedAdagrad) for each pairwise comparison of model types. Error bars represent 95% confidence intervals from multiple runs, with a dashed vertical line at zero indicating no difference between models.
Fig. 4
Fig. 4. Heterogeneity of country-specific data partitions for federated learning.
a Log-scaled distribution of country-specific dataset sizes D_Ci, sorted in descending order, highlighting disparities in data contributions. b Class imbalance across countries, showing the underrepresentation of Class 1 (confirmed MS worsening) relative to Class 0. c Histogram of dataset sizes using bin intervals of 5000, emphasizing the skewed availability of data across participating centers. d Pie chart illustrating the proportional contribution of each country to the overall dataset. Together, these analyses demonstrate the significant variability in both data quantity and label distribution across clients, underscoring the challenges faced by federated learning models operating in real-world clinical settings.
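As an illustrative sketch only (synthetic rows and hypothetical column names, not the registry's data dictionary), a per-country summary of the kind visualized in panels a and b could be tabulated as follows.

import pandas as pd

# Synthetic stand-in for the pooled dataset; one row per patient.
df = pd.DataFrame({
    "country": ["AU", "AU", "CZ", "CZ", "CZ", "IT", "TR", "TR"],
    "progression_confirmed": [0, 1, 0, 0, 1, 0, 0, 1],   # Class 1 = confirmed worsening
})

summary = (
    df.groupby("country")["progression_confirmed"]
      .agg(n_patients="size", class1_fraction="mean")    # dataset size and class balance per client
      .sort_values("n_patients", ascending=False)
)
print(summary)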
Fig. 5
Fig. 5. The diagram depicts the structure of the Baseline and AdaptiveDualBranchNet models.
a The Baseline network features a standard feedforward architecture. It begins with an Input Layer, which feeds into a series of Hidden Layers. Each hidden layer comprises neurons arranged in a fully connected structure, with arrows indicating the flow of information from one layer to the next. The connections show that each neuron in one layer is connected to every neuron in the subsequent layer, enabling the network to capture complex relationships between inputs. The network’s final layer is the Output Layer, which aggregates the learned features from the hidden layers to produce the output Y. The straightforward structure of this network is designed for general-purpose learning tasks without additional branching or specialized layers. b The AdaptiveDualBranchNet architecture extends the Baseline by introducing a dual-branch structure comprising Core Layers and Extension Layers. The Core Layers, highlighted in yellow, retain the fully connected structure of the Baseline’s Hidden Layers and are shared across all clients, being trained in an FL setup to capture fundamental, generalizable features from the data. In contrast, the Extension Layers, shown in orange, are client-specific and designed to learn personalized representations. These layers receive input from the same Input Layer as the Core Layers but follow a distinct structural design tailored to capture additional, domain- or client-specific variations in the data. Unlike the Core Layers, which are updated through FL aggregation, the Extension Layers remain locally trained, enabling each client to adapt the model to its unique distribution while benefiting from the shared knowledge encoded in the Core Layers. At the final stage, both branches feed into a set of processing nodes (depicted as c-units in red), which consolidate the learned representations before reaching the Output Layer Y. This separation between federated (global) and local (personalized) training allows the AdaptiveDualBranchNet to balance generalization and personalization, making it particularly effective in heterogeneous data environments where both shared knowledge and client-specific adaptations are necessary.
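A hedged PyTorch sketch of this dual-branch layout follows: a shared core branch (federated), a client-specific extension branch (kept local), consolidation units, and an output head. Layer sizes, names, and the shared_state_dict helper are placeholders rather than the paper's actual AdaptiveDualBranchNet configuration.

import torch
import torch.nn as nn

class AdaptiveDualBranchNetSketch(nn.Module):
    def __init__(self, n_features: int = 32, core_dim: int = 64, ext_dim: int = 32):
        super().__init__()
        # Core branch: fully connected layers shared across clients via FL aggregation.
        self.core = nn.Sequential(
            nn.Linear(n_features, core_dim), nn.ReLU(),
            nn.Linear(core_dim, core_dim), nn.ReLU(),
        )
        # Extension branch: trained only on the client's private data, never uploaded.
        self.extension = nn.Sequential(
            nn.Linear(n_features, ext_dim), nn.ReLU(),
        )
        # Consolidation units ("c-units") merging both branches before the output Y.
        self.consolidate = nn.Linear(core_dim + ext_dim, 16)
        self.output = nn.Linear(16, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([self.core(x), self.extension(x)], dim=-1)
        return self.output(torch.relu(self.consolidate(merged)))

    def shared_state_dict(self):
        # Only these parameters would be exchanged and aggregated by the server.
        return {k: v for k, v in self.state_dict().items() if k.startswith("core.")}

model = AdaptiveDualBranchNetSketch()
print(model(torch.randn(4, 32)).shape)         # torch.Size([4, 1])
print(list(model.shared_state_dict().keys()))  # core.* parameters only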

