Data encoding for healthcare data democratization and information leakage prevention
- PMID: 38383571
- PMCID: PMC10882022
- DOI: 10.1038/s41467-024-45777-z
Abstract
Two problems hinder the development and acceptance of robust deep learning-based healthcare solutions: the lack of data democratization and information leakage from trained models. This paper argues that irreversible data encoding can achieve data democratization without violating the privacy constraints imposed on healthcare data and clinical models. An ideal encoding framework transforms the data into a new space where it is imperceptible to manual or computational inspection. At the same time, the encoded data should preserve the semantics of the original data so that deep learning models can still be trained on it effectively. This paper hypothesizes the characteristics of the desired encoding framework and then exploits random projections and random quantum encoding to realize this framework for dense data and for longitudinal or time-series data. Experimental evaluation shows that models trained on encoded time-series data effectively uphold the information bottleneck principle and hence leak less information.
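The idea of encoding dense data with a random projection can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian projection matrix, the dimensions, and the synthetic data are all assumptions chosen for illustration. The sketch maps each record into a lower-dimensional space with a random matrix and then checks that pairwise distances between records are roughly preserved, which is one simple proxy for "preserving semantics". Because the output has fewer dimensions than the input, many distinct inputs map to the same encoding, so the original record cannot be recovered exactly from the encoding alone. The sketch does not by itself establish any privacy guarantee.

```python
import numpy as np


def random_projection_encode(X, out_dim, seed=0):
    """Encode rows of X with a Gaussian random projection (illustrative only).

    The projection matrix R is drawn from N(0, 1/out_dim), the usual scaling
    under which squared distances are preserved in expectation.
    """
    rng = np.random.default_rng(seed)
    R = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(X.shape[1], out_dim))
    return X @ R


def pairwise_distances(A):
    """Euclidean distance between every pair of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Synthetic stand-in for dense records: 20 samples, 1000 features (assumed sizes).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(20, 1000))

# Encode into 256 dimensions; the seed plays the role of a secret key here.
Z = random_projection_encode(X, out_dim=256, seed=7)

# Ratio of encoded to original distance for each distinct pair of records.
iu = np.triu_indices(len(X), k=1)
ratio = pairwise_distances(Z)[iu] / pairwise_distances(X)[iu]

print("encoded shape:", Z.shape)
print("distance ratio range:", ratio.min(), ratio.max())
```

Ratios close to 1 indicate that the geometry a downstream model relies on survives the encoding. A smaller `out_dim` discards more of the input and increases the distortion, so the choice trades off how much is hidden against how well structure is preserved.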
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.