Review
Patterns (N Y). 2021 Dec 10;2(12):100366.
doi: 10.1016/j.patter.2021.100366.

Differential privacy for public health data: An innovative tool to optimize information sharing while protecting data confidentiality

Amalie Dyda et al. Patterns (N Y).

Abstract

Coronavirus disease 2019 (COVID-19) has highlighted the need for the timely collection and sharing of public health data. It is important that data sharing is balanced with protecting confidentiality. Here we discuss an innovative mechanism to protect health data, called differential privacy. Differential privacy is a mathematically rigorous definition of privacy that aims to protect against all possible adversaries. In layperson's terms, statistical noise is applied to the data so that overall patterns can be described, but data on individuals are unlikely to be extracted. One of the first use cases for health data in Australia is the development of the COVID-19 Real-Time Information System for Preparedness and Epidemic Response (CRISPER), which provides proof of concept for the use of this technology in the health sector. If successful, this will benefit future sharing of public health data.
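The idea of applying statistical noise so that overall patterns survive while individual records stay hidden can be illustrated with the Laplace mechanism on a counting query. The following is a minimal sketch under stated assumptions, not the CRISPER implementation: the function names `laplace_noise` and `private_count`, the `covid_positive` field, and the toy records are all hypothetical, and the sensitivity of 1 assumes that each person contributes at most one record.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: adding or removing one person changes the true count
    by at most 1, so the sensitivity of the query is 1.
    """
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2021)
    # Hypothetical toy dataset: 100 people, 25 of them flagged as cases.
    records = [{"covid_positive": i % 4 == 0} for i in range(100)]
    noisy = private_count(records, lambda r: r["covid_positive"], epsilon=1.0, rng=rng)
    print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon gives a larger noise scale, so individual responses vary more while the average over many hypothetical repetitions stays centred on the true count. In practice every released response spends privacy budget, so repeated querying is not free.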

Keywords: COVID-19; data privacy; surveillance.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Comparative plot of the density functions of the Laplace (0; 1) and the Gaussian (0; 1) distributions. Note that the Laplace distribution has a sharp “peak” at zero, while the Gaussian is more rounded. Also note that the tails of the Laplace distribution are much heavier than those for the Gaussian distribution. That is, samples drawn from the Laplace distribution are more likely to be farther away from the mean than are samples drawn from the Gaussian distribution.
Figure 2
Distribution of the probabilities of query responses produced by the Laplace mechanism when Alice has been diagnosed with COVID-19 (blue) and when Alice has not been diagnosed with COVID-19 (orange).
Figure 3
Histogram of real data (blue) compared with differentially private query responses of the same dataset (ϵ = 1/8; orange).
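
The quantities described in the captions of Figures 1 and 2 can be checked numerically. The sketch below is illustrative only and does not reproduce the authors' plots: the function names, the grid of candidate responses, and the count of 10 cases without Alice are hypothetical, and the counting query is assumed to have sensitivity 1.

```python
import math


def laplace_pdf(x: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """Density of the Laplace(loc, scale) distribution at x."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def gaussian_pdf(x: float, mean: float = 0.0, std: float = 1.0) -> float:
    """Density of the Gaussian distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def laplace_tail(t: float, scale: float = 1.0) -> float:
    """P(|X| > t) for X ~ Laplace(0, scale), t >= 0."""
    return math.exp(-t / scale)


def gaussian_tail(t: float, std: float = 1.0) -> float:
    """P(|X| > t) for X ~ Gaussian(0, std), t >= 0."""
    return math.erfc(t / (std * math.sqrt(2.0)))


def max_density_ratio(count_without: int, epsilon: float, grid) -> float:
    """Largest ratio, over the grid of responses, between the response
    density when Alice is in the data (true count + 1) and when she is not.

    Assumption: sensitivity 1, so the Laplace noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    return max(
        laplace_pdf(r, count_without + 1, scale) / laplace_pdf(r, count_without, scale)
        for r in grid
    )


if __name__ == "__main__":
    # Figure 1: peak height at zero and two-sided tail mass beyond 3.
    print("peak:", laplace_pdf(0.0), gaussian_pdf(0.0))
    print("tail beyond 3:", laplace_tail(3.0), gaussian_tail(3.0))

    # Figure 2: the two response densities never differ by more than e^epsilon.
    epsilon = 1.0 / 8.0
    grid = [x / 10.0 for x in range(-500, 700)]
    print("max ratio:", max_density_ratio(10, epsilon, grid), "bound:", math.exp(epsilon))
```

At zero the Laplace(0; 1) density is 0.5 against roughly 0.40 for the Gaussian, and the probability of landing more than 3 away from the mean is about 0.050 for the Laplace against about 0.0027 for the Gaussian, which matches the "sharp peak, heavy tails" description. For Figure 3, if each person falls into exactly one bin (an assumption), ϵ = 1/8 corresponds to a Laplace noise scale of 8, so each bin is perturbed by about 8 counts on average in absolute value.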
