Privacy-preserving logistic regression with secret sharing
- PMID: 35366870
- PMCID: PMC8977014
- DOI: 10.1186/s12911-022-01811-y
Abstract
Background: Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Researchers who collect and combine datasets from multiple data custodians and jurisdictions gain statistical power for their analyses. However, combining data from different sources raises serious privacy concerns that must be addressed.
Methods: In this paper, we propose two privacy-preserving protocols for performing logistic regression, using the Newton-Raphson method to estimate the parameters. Our proposals are based on secure Multi-Party Computation (MPC) and are tailored to the honest-majority and dishonest-majority security settings, respectively.
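For orientation, the Newton-Raphson update that the Methods refer to can be sketched in the clear, without any secret sharing. This is a minimal plaintext illustration, not the paper's protocol: the function names (`fit_logistic_newton`, `score`, `solve`), the toy data, and the stopping rule are all assumptions made for the example. Each iteration solves (XᵀWX)·step = Xᵀ(y − p), where p holds the current predicted probabilities and W = diag(p(1 − p)).

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(X, beta):
    # Predicted probability for every row of X.
    return [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]

def score(X, y, beta):
    # Gradient of the log-likelihood: X^T (y - p).
    p = predict(X, beta)
    d = len(beta)
    return [sum(row[j] * (yi - pi) for row, yi, pi in zip(X, y, p))
            for j in range(d)]

def solve(a, b):
    # Solve a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic_newton(X, y, iterations=25, tol=1e-10):
    # Hypothetical helper: plain Newton-Raphson for logistic regression.
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(iterations):
        p = predict(X, beta)
        grad = score(X, y, beta)
        # Negative Hessian of the log-likelihood: X^T W X, W = diag(p(1-p)).
        hess = [[sum(row[j] * row[k] * pi * (1.0 - pi)
                     for row, pi in zip(X, p))
                 for k in range(d)] for j in range(d)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy, non-separable data (assumed for illustration): intercept plus one feature.
X = [[1.0, float(x)] for x in range(6)]
y = [0, 0, 1, 0, 1, 1]
beta = fit_logistic_newton(X, y)
print(beta)
```

In a secure variant, the same arithmetic would be carried out on secret-shared values, and non-linear steps such as the sigmoid and the matrix inversion typically need dedicated sub-protocols or approximations.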
Results: The proposed protocols are evaluated on both synthetic and real-world datasets in terms of efficiency and accuracy, and are compared with ordinary (non-private) logistic regression. The experimental results demonstrate that the proposed protocols are highly efficient and accurate.
Conclusions: Our work introduces two iterative algorithms to enable the distributed training of a logistic regression model in a privacy-preserving manner. The implementation results show that our algorithms can handle large datasets from multiple sources.
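To make the idea of computing on data from multiple sources without revealing it more concrete, here is a minimal sketch of additive secret sharing, one common building block of MPC. It is an illustration under stated assumptions, not the scheme used in the paper: the modulus `PRIME`, the three-party setup, and the helper names `share`, `add_shares`, and `reconstruct` are all hypothetical.

```python
import secrets

# Assumed field modulus for the example (2**61 - 1 is a Mersenne prime).
PRIME = 2**61 - 1

def share(value, n_parties=3):
    # Split value into n_parties random shares that sum to value mod PRIME.
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def add_shares(a, b):
    # Each party adds its own two shares locally; no communication is needed.
    return [(x + y) % PRIME for x, y in zip(a, b)]

def reconstruct(shares):
    # Only the combination of all shares reveals the value.
    return sum(shares) % PRIME

# Two data custodians contribute private counts (toy values).
a_shares = share(120)
b_shares = share(35)
total = reconstruct(add_shares(a_shares, b_shares))
print(total)
```

Addition of shared values is purely local, which is why sums such as gradient contributions are cheap in this setting; multiplications require interaction between the parties and account for much of the cost of secure protocols.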
Keywords: Logistic regression; Multi-party computation; Newton–Raphson; Privacy-preserving; Secret sharing.
© 2022. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- High performance logistic regression for privacy-preserving genome analysis. BMC Med Genomics. 2021 Jan 20;14(1):23. doi: 10.1186/s12920-020-00869-9. PMID: 33472626. Free PMC article.
- Efficient Privacy-Preserving K-Means Clustering from Secret-Sharing-Based Secure Three-Party Computation. Entropy (Basel). 2022 Aug 18;24(8):1145. doi: 10.3390/e24081145. PMID: 36010809. Free PMC article.
- Accurate training of the Cox proportional hazards model on vertically-partitioned data while preserving privacy. BMC Med Inform Decis Mak. 2022 Feb 24;22(1):49. doi: 10.1186/s12911-022-01771-3. PMID: 35209883. Free PMC article.
- Methods of privacy-preserving genomic sequencing data alignments. Brief Bioinform. 2021 Nov 5;22(6):bbab151. doi: 10.1093/bib/bbab151. PMID: 34021302. Review.
- A survey on genomic data by privacy-preserving techniques perspective. Comput Biol Chem. 2021 Aug;93:107538. doi: 10.1016/j.compbiolchem.2021.107538. Epub 2021 Jun 29. PMID: 34246892. Review.
Cited by
- 21st century (clinical) decision support in nursing and allied healthcare. Developing a learning health system: a reasoned design of a theoretical framework. BMC Med Inform Decis Mak. 2023 Dec 5;23(1):279. doi: 10.1186/s12911-023-02372-4. PMID: 38053104. Free PMC article. Review.