IEEE Trans Inf Technol Biomed. 2012 May;16(3):413-23.
doi: 10.1109/TITB.2012.2185850. Epub 2012 Jan 27.

Anonymization of longitudinal electronic medical records

Acar Tamersoy et al. IEEE Trans Inf Technol Biomed. 2012 May.

Abstract

Electronic medical record (EMR) systems have enabled healthcare providers to collect detailed patient information from the primary care domain. At the same time, longitudinal data from EMRs are increasingly combined with biorepositories to generate personalized clinical decision support protocols. Emerging policies encourage investigators to disseminate such data in a deidentified form for reuse and collaboration, but organizations are hesitant to do so because they fear such actions will jeopardize patient privacy. In particular, there are concerns that residual demographic and clinical features could be exploited for reidentification purposes. Various approaches have been developed to anonymize clinical data, but they neglect temporal information and are thus insufficient for emerging biomedical research paradigms. This paper proposes a novel approach to share patient-specific longitudinal data that offers robust privacy guarantees while preserving data utility for many biomedical investigations. Our approach aggregates temporal and diagnostic information using heuristics inspired by sequence alignment and clustering methods. Using several patient cohorts derived from the EMR system of the Vanderbilt University Medical Center, we demonstrate that the proposed approach can generate anonymized data that permit effective biomedical analysis.

Figures

Fig. 1. Longitudinal data privacy problem. (a) Longitudinal research data. (b) Identified EMR. (c) 2-anonymization based on the proposed approach.
Fig. 2. General architecture of the longitudinal data anonymization process.
Fig. 3. Example of the DGH structure for Age.
Fig. 4. Example of the hypertension subtree in the ICD DGH.
Fig. 5. Matrices i, a, and r for the subset of records from Fig. 1. This alignment uses the DGHs in Figs. 3 and 4 and assumes that w_ICD = w_Age = 0.5.
Fig. 6. Comparison of information loss for D50Pop using various k values.
Fig. 7. Comparison of query answering accuracy for D50Pop, D4Pop, and D4Smp using various k values.
Fig. 8. A comparison of information loss for D50Pop using various w_ICD and w_Age values.
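
The captions for Figs. 1(c), 3, and 4 refer to k-anonymization and to generalization along hierarchies for Age and ICD codes. The following minimal Python sketch illustrates only the general idea of k-anonymity over longitudinal records; it is not the algorithm of the paper. The function names, the fixed-width age intervals, and the truncation of an ICD-9 code to its three-digit category are all illustrative assumptions standing in for one level of a generalization hierarchy, and the two toy records are invented for the example.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def is_k_anonymous(records: Iterable[Sequence[Hashable]], k: int) -> bool:
    """Return True if every distinct record (a sequence of visits)
    is shared by at least k records in the dataset."""
    counts = Counter(tuple(record) for record in records)
    return all(count >= k for count in counts.values())


def generalize_age(age: int, width: int = 10) -> str:
    """Hypothetical stand-in for one hierarchy level: map an exact age
    to a fixed-width interval such as [30-39]."""
    low = (age // width) * width
    return f"[{low}-{low + width - 1}]"


def generalize_icd(code: str) -> str:
    """Hypothetical stand-in for one hierarchy level: keep only the
    three-digit ICD-9 category (e.g. 401.1 -> 401)."""
    return code.split(".")[0]


# Two invented toy records, each a sequence of (ICD code, age) visits.
raw = [
    [("401.1", 34), ("401.9", 36)],
    [("401.1", 38), ("401.9", 39)],
]

# Generalize every visit; the two records become indistinguishable.
generalized = [
    [(generalize_icd(icd), generalize_age(age)) for icd, age in record]
    for record in raw
]

print(is_k_anonymous(raw, 2))          # False
print(is_k_anonymous(generalized, 2))  # True
```

In this toy case both records generalize to the same sequence, so the generalized data satisfy 2-anonymity while the raw data do not. A real method would also have to handle records of different lengths and trade off how far each attribute is generalized, which this sketch does not attempt.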
