Int J Popul Data Sci. 2022 Nov 22;7(4):1755.
doi: 10.23889/ijpds.v7i4.1755. eCollection 2022.

Validating a novel deterministic privacy-preserving record linkage between administrative & clinical data: applications in stroke research

Alisia Southwell et al. Int J Popul Data Sci.

Abstract

Introduction: Research data combined with administrative data provides a robust resource capable of answering unique research questions. However, in cases where personal health data are encrypted, due to ethics requirements or institutional restrictions, traditional methods of deterministic and probabilistic record linkages are not feasible. Instead, privacy-preserving record linkages must be used to protect patients' personal data during data linkage.

Objectives: To determine the feasibility and validity of a deterministic privacy-preserving data linkage protocol using homomorphically encrypted data.

Methods: Feasibility was measured as the proportion of records that successfully matched via direct identifiers, and validity as the proportion of records that matched on multiple indirect identifiers. The thresholds for feasibility and validity were both set at 95%. The datasets shared a single direct identifier (health card number) and multiple indirect identifiers (sex and date of birth). Direct identifiers were encrypted in both datasets and then transferred to a third-party server capable of linking the encrypted identifiers without decrypting individual records. Once linked, the study team used the indirect identifiers to verify the accuracy of the linkage in the final dataset.
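
The linkage and verification steps described in the Methods can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's implementation: a keyed hash (HMAC-SHA256) stands in for the homomorphic encryption used in the study, the field names (`hcn`, `sex`, `dob`) and function names are hypothetical, and validity is assumed to be computed among the records that matched on the direct identifier.

```python
import hashlib
import hmac

THRESHOLD = 0.95  # feasibility and validity threshold stated in the Methods


def pseudonymize(health_card_number: str, key: bytes) -> str:
    """Keyed hash of the direct identifier.

    Assumption: this is a simple stand-in for the study's homomorphic
    encryption; it is NOT homomorphic encryption.
    """
    return hmac.new(key, health_card_number.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize(records, key):
    """Run by each data holder: replace the health card number with a token."""
    return [
        {"token": pseudonymize(r["hcn"], key), "sex": r["sex"], "dob": r["dob"]}
        for r in records
    ]


def link_and_validate(clinical_tokens, admin_tokens):
    """Link on tokens only, then check agreement on sex and date of birth.

    Returns (feasibility, validity) as proportions:
      feasibility = clinical records matched on the token / all clinical records
      validity    = matched records agreeing on sex and dob / matched records
    """
    if not clinical_tokens:
        raise ValueError("clinical dataset is empty")

    # Deterministic linkage: exact equality of tokens, no plaintext identifiers.
    admin_by_token = {r["token"]: r for r in admin_tokens}

    matched = 0
    verified = 0
    for rec in clinical_tokens:
        other = admin_by_token.get(rec["token"])
        if other is None:
            continue
        matched += 1
        # Verification with the indirect identifiers.
        if rec["sex"] == other["sex"] and rec["dob"] == other["dob"]:
            verified += 1

    feasibility = matched / len(clinical_tokens)
    validity = verified / matched if matched else 0.0
    return feasibility, validity


def meets_threshold(rate: float, threshold: float = THRESHOLD) -> bool:
    """True when a rate reaches the pre-specified threshold."""
    return rate >= threshold
```

In this sketch each data holder would call `tokenize` with a shared secret key before transfer, and the linking party would call `link_and_validate` on the tokenized records only. Both returned rates can then be compared against the 95% threshold with `meets_threshold`.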

Results: Using a combination of manual and automated data transfer, the privacy-preserving linkage of a sample of 8,128 individuals to a population sample of over 3.2 million records took 36 days. Of the records, 99.9% were successfully matched with direct identifiers, and 99.8% were successfully matched with multiple indirect identifiers. We deemed the linkage both feasible and valid.

Conclusions: As combining administrative and research data becomes increasingly common, it is imperative to understand options for linking data when direct linkage is not feasible. The current linkage process ensured the privacy and security of patient data and improved data quality. While the initial implementations required significant computational and human resources, increased automation keeps the requirements within feasible bounds.

Keywords: data linkage; feasibility; personal health information; privacy; stroke.

Conflict of interest statement

Conflicts of interest: There are no conflicts of interest to report.

Figures

Figure 1: Results of the privacy-preserving record linkage

