Privacy-protecting multivariable-adjusted distributed regression analysis for multi-center pediatric study
- PMID: 31578038
- PMCID: PMC7113085
- DOI: 10.1038/s41390-019-0596-0
Abstract
Background: Privacy-protecting analytic approaches without centralized pooling of individual-level data, such as distributed regression, are particularly important for vulnerable populations, such as children, but these methods have not yet been tested in multi-center pediatric studies.
Methods: Using electronic health data from 34 healthcare institutions in the National Patient-Centered Clinical Research Network (PCORnet), we fit 12 multivariable-adjusted linear regression models to assess the associations of antibiotic use at <24 months of age with body mass index z-score at 48 to <72 months of age. We fit each model twice: once with conventional multivariable-adjusted regression on pooled individual-level data (the reference method), and once with distributed regression, a more privacy-protecting technique that pools only summary-level intermediate statistics. We then compared the results of the two methods.
Results: Pooled individual-level and distributed linear regression analyses produced virtually identical parameter estimates and standard errors. Across all 12 models, the maximum difference in any of the parameter estimates or standard errors was 4.4833 × 10⁻¹⁰.
Conclusions: We demonstrated empirically the feasibility and validity of distributed linear regression analysis using only summary-level information within a large multi-center study of children. This approach could enable expanded opportunities for multi-center pediatric research, especially when sharing of granular individual-level data is challenging.
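The comparison described in the Methods can be illustrated with a minimal sketch. It is not the study's code. It assumes the textbook formulation of distributed linear regression, in which each site shares only its sample size and the cross-product summaries X'X, X'y, and y'y. All function names, the simulated data, and the coefficient values are hypothetical.

```python
import numpy as np

def simulate_sites(seed=0, sizes=(200, 150, 300)):
    """Hypothetical data: one (X, y) pair per site, with made-up coefficients."""
    rng = np.random.default_rng(seed)
    true_beta = np.array([0.2, 0.05, -0.1])  # intercept, exposure, covariate (made up)
    sites = []
    for n in sizes:
        exposure = rng.integers(0, 2, size=n).astype(float)
        covariate = rng.normal(size=n)
        X = np.column_stack([np.ones(n), exposure, covariate])
        y = X @ true_beta + rng.normal(size=n)
        sites.append((X, y))
    return sites

def site_summaries(X, y):
    """Run locally at a site; only these aggregates would leave the site."""
    return {"n": X.shape[0], "XtX": X.T @ X, "Xty": X.T @ y, "yty": float(y @ y)}

def distributed_ols(summaries):
    """Combine summary-level statistics into OLS estimates and standard errors."""
    n = sum(s["n"] for s in summaries)
    XtX = sum(s["XtX"] for s in summaries)
    Xty = sum(s["Xty"] for s in summaries)
    yty = sum(s["yty"] for s in summaries)
    beta = np.linalg.solve(XtX, Xty)
    # Residual sum of squares from summaries alone: y'y - beta'X'y
    rss = yty - beta @ Xty
    sigma2 = rss / (n - XtX.shape[0])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(XtX)))
    return beta, se

def pooled_ols(X, y):
    """Reference method: OLS on stacked individual-level data."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

# Usage: compare the two approaches on the simulated sites.
sites = simulate_sites()
beta_d, se_d = distributed_ols([site_summaries(X, y) for X, y in sites])
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_p, se_p = pooled_ols(X_all, y_all)
max_diff = max(np.abs(beta_d - beta_p).max(), np.abs(se_d - se_p).max())
print(f"maximum absolute difference: {max_diff:.3e}")
```

Because the sums of X'X and X'y across sites equal the pooled cross-products, the two approaches are algebraically identical, and any difference reflects floating-point rounding only. This is consistent with the very small maximum difference reported in the Results.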
Conflict of interest statement
DISCLOSURE
The authors declare no conflict of interest. The funding organization was not involved in the design of the study; the collection, analysis, and interpretation of the data; or the decision to approve publication of the finished manuscript.
Comment in
- Privacy-preserving statistical analyses in Learning Health Systems. Pediatr Res. 2020 May;87(6):978-979. doi: 10.1038/s41390-020-0835-4. Epub 2020 Mar 14. PMID: 32172277. No abstract available.