2022 Apr 2;22(1):89.
doi: 10.1186/s12911-022-01811-y.

Privacy-preserving logistic regression with secret sharing

Ali Reza Ghavamipour et al. BMC Med Inform Decis Mak.

Abstract

Background: Logistic regression (LR) is a widely used method for modeling binary outcomes in medical data classification tasks. Researchers who collect and combine datasets from multiple data custodians and jurisdictions benefit from increased statistical power for their analyses. However, combining data from different sources raises serious privacy concerns that must be addressed.

Methods: We propose two privacy-preserving protocols for logistic regression that use the Newton–Raphson method to estimate the model parameters. Our proposals are based on secure Multi-Party Computation (MPC) and are tailored to the honest-majority and dishonest-majority security settings.
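For orientation, the following is a minimal plaintext sketch of Newton–Raphson parameter estimation for logistic regression, using the standard update β ← β + (XᵀWX)⁻¹Xᵀ(y − p). It is not the authors' protocol: it runs in the clear with no secret sharing, and the function names, the toy data, and the fixed iteration count are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logistic_newton(X, y, iterations=10):
    """Estimate coefficients with a fixed number of Newton-Raphson steps."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d                      # X^T (y - p)
        hess = [[0.0] * d for _ in range(d)]  # X^T W X, with W = diag(p(1-p))
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for j in range(d):
                grad[j] += (yi - p) * xi[j]
                for k in range(d):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Toy, non-separable data (hypothetical): intercept column plus one feature.
X = [[1.0, float(v)] for v in range(6)]
y = [0, 0, 1, 0, 1, 1]
beta = fit_logistic_newton(X, y)
print(beta)
```

In a privacy-preserving variant, the gradient and Hessian sums above are the quantities that would have to be computed on shared rather than plaintext values; how the paper does this is not specified in the abstract.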

Results: The proposed protocols are evaluated on both synthetic and real-world datasets in terms of efficiency and accuracy, and are compared with ordinary logistic regression. The experimental results show that the proposed protocols are both efficient and accurate.

Conclusions: Our work introduces two iterative algorithms that enable the distributed training of a logistic regression model in a privacy-preserving manner. The implementation results show that our algorithms can handle large datasets from multiple sources.
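To illustrate the general idea of secret sharing across multiple data sources, here is a minimal sketch of additive secret sharing over a prime field, used to sum private integers without any single party seeing an individual input. This is a generic textbook construction and not necessarily the scheme used in the paper; the modulus `PRIME`, the function names, and the three-party default are all assumptions. Real-valued quantities such as gradients would additionally need a fixed-point encoding, which is omitted here.

```python
import secrets

# Hypothetical field modulus (a Mersenne prime), chosen only for illustration.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine additive shares into the original value mod PRIME."""
    return sum(shares) % PRIME


def secure_sum(private_values, n_parties=3):
    """Sum private inputs: each computing party adds the shares it holds."""
    # Each data custodian splits its own value and sends one share per party.
    all_shares = [share(v % PRIME, n_parties) for v in private_values]
    # Party j locally adds the j-th share received from every custodian.
    local_sums = [sum(col) % PRIME for col in zip(*all_shares)]
    # Only the combined result is revealed when the local sums are recombined.
    return reconstruct(local_sums)


total = secure_sum([12, 30, 7])
print(total)
```

Because addition commutes with sharing, the parties never need to exchange the inputs themselves. Multiplication of shared values, which Newton–Raphson also requires, needs additional machinery that this sketch does not cover.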

Keywords: Logistic regression; Multi-party computation; Newton–Raphson; Privacy-preserving; Secret sharing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Efficiency comparison for increasing number of records using accurate algorithm 1
Fig. 2. Efficiency comparison for increasing number of features using accurate algorithm 1
Fig. 3. Efficiency comparison for increasing number of records using approximation-based algorithm 2
Fig. 4. Efficiency comparison for increasing number of features using approximation-based algorithm 2
