Comparative Study
Pediatr Res. 2020 May;87(6):1086-1092.
doi: 10.1038/s41390-019-0596-0. Epub 2019 Oct 2.

Privacy-protecting multivariable-adjusted distributed regression analysis for multi-center pediatric study

Sengwee Toh et al. Pediatr Res. 2020 May.

Abstract

Background: Privacy-protecting analytic approaches without centralized pooling of individual-level data, such as distributed regression, are particularly important for vulnerable populations, such as children, but these methods have not yet been tested in multi-center pediatric studies.

Methods: Using the electronic health data from 34 healthcare institutions in the National Patient-Centered Clinical Research Network (PCORnet), we fit 12 multivariable-adjusted linear regression models to assess the associations of antibiotic use <24 months of age with body mass index z-score at 48 to <72 months of age. We ran these models using pooled individual-level data and conventional multivariable-adjusted regression (reference method), as well as using the more privacy-protecting pooled summary-level intermediate statistics and distributed regression technique. We compared the results from these two methods.
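The idea of fitting a linear regression from summary-level intermediate statistics can be illustrated with a minimal sketch. It assumes the textbook sufficient-statistics formulation of ordinary least squares, in which each site shares only its sample size and the cross-products of its design matrix and outcome; the abstract does not describe the software or the exact statistics the authors used. All function names, variable names, and the simulated data are hypothetical.

```python
import numpy as np

def site_summaries(X, y):
    """Run locally at one site; only these aggregates would leave the site."""
    return {"n": X.shape[0], "xtx": X.T @ X, "xty": X.T @ y, "yty": float(y @ y)}

def combine_and_fit(summaries):
    """Coordinator step: add up the site summaries and solve the normal equations."""
    n = sum(s["n"] for s in summaries)
    xtx = sum(s["xtx"] for s in summaries)
    xty = sum(s["xty"] for s in summaries)
    yty = sum(s["yty"] for s in summaries)
    beta = np.linalg.solve(xtx, xty)
    # Residual sum of squares expressed through the summaries only.
    rss = yty - 2 * beta @ xty + beta @ xtx @ beta
    sigma2 = rss / (n - xtx.shape[0])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtx)))
    return beta, se

def pooled_fit(X, y):
    """Reference method: conventional regression on pooled individual-level rows."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated data for three hypothetical sites (not the study data).
rng = np.random.default_rng(0)
sites = []
for n in (200, 350, 500):
    exposure = rng.integers(0, 2, size=n)   # binary exposure: any use vs. no use
    covariate = rng.normal(size=n)          # one continuous adjustment covariate
    X = np.column_stack([np.ones(n), exposure, covariate])
    y = 0.1 + 0.05 * exposure + 0.3 * covariate + rng.normal(size=n)
    sites.append((X, y))

beta_d, se_d = combine_and_fit([site_summaries(X, y) for X, y in sites])

X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_p, se_p = pooled_fit(X_all, y_all)

# Largest absolute gap between the two methods across estimates and standard errors.
max_diff = max(np.max(np.abs(beta_d - beta_p)), np.max(np.abs(se_d - se_p)))
```

In this sketch the two fits agree up to floating-point rounding, because the summed cross-products equal the cross-products of the stacked data. Note that such summaries are more privacy-protecting than row-level data but can still be revealing when a site is very small.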

Results: Pooled individual-level and distributed linear regression analyses produced virtually identical parameter estimates and standard errors. Across all 12 models, the maximum difference in any of the parameter estimates or standard errors was 4.4833 × 10⁻¹⁰.

Conclusions: We demonstrated empirically the feasibility and validity of distributed linear regression analysis using only summary-level information within a large multi-center study of children. This approach could enable expanded opportunities for multi-center pediatric research, especially when sharing of granular individual-level data is challenging.

Conflict of interest statement

DISCLOSURE

The authors declare no conflict of interest. The funding organization was not involved in the design of the study; the collection, analysis, and interpretation of the data; or the decision to approve publication of the finished manuscript.

Figures

Figure 1.
Results from linear regression models that considered antibiotic use as a binary variable (any use vs. no use) and body mass index z-score as the continuous outcome variable among patients without complex chronic conditions, by network partner. The models included all the covariates in Table 2. Note: The values are parameter estimates for any antibiotic use (vs. no use) and their 95% confidence intervals. One of the 27 network partners was excluded from this figure because of its small sample size (n = 34), but its data were included in the pooled individual-level data analysis and the distributed regression analysis.


