2018 Mar 1;25(3):315-320.
doi: 10.1093/jamia/ocx129.

Assessing privacy risks in population health publications using a checklist-based approach

Christine M O'Keefe et al. J Am Med Inform Assoc.

Abstract

Objective: Recent growth in the number of population health researchers accessing detailed datasets, either on their own computers or through virtual data centers, has the potential to increase privacy risks. In response, a checklist for identifying and reducing privacy risks in population health analysis outputs has been proposed for use by researchers themselves. In this study we explore the usability and reliability of such an approach by investigating whether different users identify the same privacy risks on applying the checklist to a sample of publications.

Methods: The checklist was applied to a sample of 100 academic population health publications distributed among 5 readers. Cohen's κ was used to measure interrater agreement.
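
As an illustration of the agreement measure named above, the following is a minimal sketch of Cohen's κ for two readers, computed as (observed agreement − chance agreement) / (1 − chance agreement). The function name, the 0/1 coding (1 = output flagged as a potential privacy concern), and the example ratings are assumptions made for illustration; they are not data or code from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up ratings for ten outputs (1 = potential privacy concern, 0 = none).
reader_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reader_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the readers agree on 8 of 10 outputs, but because each flags 4 of 10, agreement expected by chance is 0.52, so κ is noticeably lower than the raw agreement rate.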

Results: Of the 566 instances of statistical output types found in the 100 publications, the most frequently occurring were counts, summary statistics, plots, and model outputs. Application of the checklist identified 128 outputs (22.6%) with potential privacy concerns. Most of these were associated with the reporting of small counts. Among these identified outputs, the readers found no substantial actual privacy concerns when context was taken into account. Interrater agreement for identifying potential privacy concerns was generally good.

Conclusion: This study has demonstrated that a checklist can be a reliable tool to assist researchers with anonymizing analysis outputs in population health research. This further suggests that such an approach may have the potential to be developed into a broadly applicable standard providing consistent confidentiality protection across multiple analyses of the same data.

Keywords: biomedical research; confidentiality; data anonymization; health services research; privacy.

Figures

Figure 1. Number of outputs of each type among the 566 statistical outputs identified in the sample of 100 publications, together with the number of outputs of each type flagged by one or more of the step 1 anonymity tests as applied by one or more readers.
