Performance and Information Leakage in Splitfed Learning and Multi-Head Split Learning in Healthcare Data and Beyond
- PMID: 35893586
- PMCID: PMC9326525
- DOI: 10.3390/mps5040060
Abstract
Machine learning (ML) in healthcare data analytics is attracting much attention because of its unprecedented power to extract knowledge that improves decision-making. At the same time, the laws and ethics codes that countries draft to govern healthcare data are becoming more stringent. While healthcare practitioners struggle with these enforced governance frameworks, distributed learning-based frameworks are emerging and disrupting traditional ML model development. Splitfed learning (SFL) is a recent development in distributed machine learning that lets healthcare practitioners train ML models while preserving the privacy of the input data. However, SFL incurs extra communication and computation overhead on the client side because it requires client-side model synchronization. For a resource-constrained client side (hospitals with limited computational power), removing this requirement is necessary to make learning efficient. This paper therefore studies SFL without client-side model synchronization; the resulting architecture is known as multi-head split learning (MHSL). It is also important to investigate information leakage, which indicates how much information about the raw data the server gains directly from the smashed data (the output of the client-side model portion) that the client passes to it. Our empirical studies examine the ResNet-18 and Conv1-D architectures on the ECG and HAM-10000 datasets under an IID data distribution. The results show that SFL provides 1.81% and 2.36% better accuracy than MHSL on the ECG and HAM-10000 datasets, respectively (with the cut-layer value set to 1). Experiments with various client-side model portions show that the size of the portion affects overall performance: as the number of layers in the client-side model portion increases, SFL performance improves while MHSL performance degrades. The experimental results also show that information leakage, measured by mutual information score values, is higher in SFL than in MHSL for the ECG and HAM-10000 datasets by 2×10⁻⁵ and 4×10⁻³, respectively.
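The difference between SFL and MHSL described in the abstract comes down to one step at the end of a training round: whether the client-side model portions are synchronized. The sketch below illustrates that step only. It assumes synchronization is a sample-weighted average of client-side weights (FedAvg-style), and it represents weights as plain dictionaries of float lists. The names `fedavg` and `end_of_round` are hypothetical and are not taken from the paper's implementation.

```python
from typing import Dict, List

# Hypothetical representation: layer name -> flat list of weights.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], num_samples: List[int]) -> Weights:
    """Sample-weighted average of client-side weights (assumed FedAvg-style)."""
    total = sum(num_samples)
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, num_samples)) / total
            for i in range(length)
        ]
    return averaged


def end_of_round(
    client_weights: List[Weights],
    num_samples: List[int],
    synchronize: bool,
) -> List[Weights]:
    """Return the client-side weights each client starts the next round with.

    synchronize=True  -> SFL-like: every client receives the averaged weights.
    synchronize=False -> MHSL-like: every client keeps its own weights,
                         so no client-side aggregation or broadcast is needed.
    """
    if not synchronize:
        return [{k: list(v) for k, v in w.items()} for w in client_weights]
    avg = fedavg(client_weights, num_samples)
    return [{k: list(v) for k, v in avg.items()} for _ in client_weights]


# Two toy clients holding 1 and 3 samples respectively.
clients = [{"conv1": [1.0, 2.0]}, {"conv1": [3.0, 6.0]}]
samples = [1, 3]

sfl_like = end_of_round(clients, samples, synchronize=True)
mhsl_like = end_of_round(clients, samples, synchronize=False)
```

With `synchronize=False` the averaging and the broadcast of averaged weights are skipped entirely, which is the client-side overhead the abstract says MHSL removes. The clients then hold diverging client-side portions ("heads") that feed the same server-side model.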
Keywords: distributed collaborative machine learning; information leakage in distributed learning; multi-head split learning; parameter transmission-based distributed machine learning; privacy-preserving machine learning; split learning.
Conflict of interest statement
The authors declare no conflict of interest.