Biopreserv Biobank. 2015 Jun;13(3):178-82.
doi: 10.1089/bio.2014.0069.

The hybrid synthetic microdata platform: a method for statistical disclosure control

Joël Kuiper et al. Biopreserv Biobank. 2015 Jun.

Abstract

Owners of biobanks are in an unfortunate position: on the one hand, they need to protect the privacy of their participants, whereas on the other, their usefulness relies on the disclosure of the data they hold. Existing methods for Statistical Disclosure Control attempt to find a balance between utility and confidentiality, but come at a cost for the analysts of the data. We outline an alternative perspective to the balance between confidentiality and utility. By combining the generation of synthetic data with the automated execution of data analyses, biobank owners can guarantee the privacy of their participants, yet allow the analysts to work in an unrestricted manner.
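The workflow described in the abstract can be illustrated with a minimal sketch: the analyst drafts an analysis against synthetic data, and the owner then executes the same script on the original data and returns only its output. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`make_synthetic`, `analysis`, `run_on_original`) are hypothetical, the "confidential" records are made-up numbers, and a fitted multivariate normal stands in for whatever synthetic data generator the platform actually uses.

```python
import numpy as np


def make_synthetic(original, n_rows, rng):
    """Draw synthetic rows from a multivariate normal fitted to the original data.

    Assumption: a Gaussian fit is used only as a simple stand-in generator.
    """
    mean = original.mean(axis=0)
    cov = np.cov(original, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_rows)


def analysis(data):
    """Analyst's script: returns a single aggregate, the Pearson correlation."""
    return float(np.corrcoef(data[:, 0], data[:, 1])[0, 1])


def run_on_original(script, original):
    """Owner-side step: run the analyst's script on the real data, return its output."""
    return script(original)


rng = np.random.default_rng(0)

# Made-up stand-in for confidential records: two correlated variables.
x = rng.normal(size=500)
original = np.column_stack([x, 0.8 * x + rng.normal(scale=0.5, size=500)])

# The analyst only ever sees the synthetic rows while developing the script.
synthetic = make_synthetic(original, n_rows=500, rng=rng)
draft_result = analysis(synthetic)

# The finished script is executed on the original data on the owner's side.
final_result = run_on_original(analysis, original)

print(f"correlation on synthetic data: {draft_result:.3f}")
print(f"correlation on original data:  {final_result:.3f}")
```

In this sketch the analyst never handles the original rows directly. A real deployment would also need to check that the returned output itself does not disclose individual records, which this example does not attempt.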

Figures

FIG. 1. Comparison of synthetic data (circles) and original data (triangles) across two correlated variables.
FIG. 2. Parallel coordinates plot of several variables. Synthetic data (solid) and original data (dashed).
FIG. 3. Implementation strategy for a hybrid synthetic data system.
