A secure distributed logistic regression protocol for the detection of rare adverse drug events
- PMID: 22871397
- PMCID: PMC3628043
- DOI: 10.1136/amiajnl-2011-000735
Abstract
Background: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak.
Objective: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees.
Methods: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets.
Results: Computation time scales linearly with dataset size but grows exponentially as sites are added. For up to five sites, however, the time remains short and would not affect practical applications. The estimated model parameters match those obtained from the pooled raw data analyzed in SAS, demonstrating high model accuracy.
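The agreement with a pooled analysis can be illustrated with a minimal sketch of distributed Newton-Raphson fitting, in which each site sends only aggregate gradient and Hessian contributions to a single analysis center. This is not the published protocol: the cryptographic layer that keeps those summaries private is omitted entirely, the data are simulated, and every name (`local_summaries`, `fit_distributed`, `simulate_site`) is hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def local_summaries(X, y, beta):
    """Runs at a site: returns only aggregate gradient and Hessian terms."""
    p = len(beta)
    grad = [0.0] * p
    hess = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        mu = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        w = mu * (1.0 - mu)
        for j in range(p):
            grad[j] += xi[j] * (yi - mu)
            for k in range(p):
                hess[j][k] += w * xi[j] * xi[k]
    return grad, hess

def solve(A, b):
    """Solves A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_distributed(sites, p, iters=25, tol=1e-10):
    """Runs at the analysis center: sums site summaries, takes Newton steps."""
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for X, y in sites:
            g, h = local_summaries(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            hess = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(hess, h)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def simulate_site(n, true_beta, rng):
    # Simulated data only: intercept plus two standard-normal covariates.
    X, y = [], []
    for _ in range(n):
        xi = [1.0, rng.gauss(0, 1), rng.gauss(0, 1)]
        pr = sigmoid(sum(b * v for b, v in zip(true_beta, xi)))
        X.append(xi)
        y.append(1 if rng.random() < pr else 0)
    return X, y

rng = random.Random(0)
true_beta = [-1.0, 0.8, -0.5]  # assumed values for the simulation
sites = [simulate_site(n, true_beta, rng) for n in (300, 500, 400)]
pooled = [([x for X, _ in sites for x in X], [v for _, y in sites for v in y])]

beta_dist = fit_distributed(sites, 3)   # three sites, summaries only
beta_pool = fit_distributed(pooled, 3)  # same rows treated as one dataset
```

Because the log-likelihood gradient and Hessian are sums over patients, summing per-site contributions gives the same Newton step as the pooled data, so `beta_dist` and `beta_pool` differ only by floating-point rounding. Exchanging these summaries in the clear can still leak information, which is the gap a secure protocol is meant to close.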
Conclusion: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models.
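The extension to other link functions can be motivated by the standard iteratively reweighted least squares update for generalized linear models, which also separates into per-site sums. The notation below is an assumption for illustration and is not taken from the paper: site index $s = 1,\dots,S$, patient index $i$, link function $g$, variance function $V$, design matrix $X_s$, and diagonal weight matrix $W_s$ with entries $w_{si}$.

```latex
\begin{aligned}
\eta_{si} &= x_{si}^{\top}\beta, \qquad \mu_{si} = g^{-1}(\eta_{si}),\\
w_{si} &= \frac{1}{V(\mu_{si})\, g'(\mu_{si})^{2}}, \qquad
z_{si} = \eta_{si} + (y_{si} - \mu_{si})\, g'(\mu_{si}),\\
\beta^{\text{new}} &= \Big(\sum_{s=1}^{S} X_s^{\top} W_s X_s\Big)^{-1}
\sum_{s=1}^{S} X_s^{\top} W_s z_s .
\end{aligned}
```

For the logit link, $g'(\mu) = 1/\{\mu(1-\mu)\}$ and $V(\mu) = \mu(1-\mu)$, so $w_{si} = \mu_{si}(1-\mu_{si})$ and the update reduces to the usual Newton-Raphson step for logistic regression. Generalized estimating equations have a similar additive form when each site is treated as a cluster, $\sum_{s} D_s^{\top} V_s^{-1}(y_s - \mu_s) = 0$ with $D_s = \partial\mu_s/\partial\beta$ and $V_s$ a working covariance matrix, although whether the authors' extension proceeds this way is not stated in the abstract.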