A software tool for removing patient identifying information from clinical documents
- PMID: 18579831
- PMCID: PMC2528047
- DOI: 10.1197/jamia.M2702
Abstract
We created the Medical De-identification System (MeDS), a software tool designed to remove all patient identifying information from various kinds of clinical documents, including laboratory and narrative reports, and performed 2 evaluations of it. Our first evaluation used 2,400 Health Level Seven (HL7) messages from 10 different HL7 message producers. After modifying the software based on the results of this first evaluation, we performed a second evaluation using 7,190 pathology report HL7 messages. In both evaluations, we compared the output of the MeDS de-identification process against a gold standard of human review for identifying strings. We counted the successful scrubs, missed identifiers, and over-scrubs committed by MeDS, and evaluated the readability and interpretability of the scrubbed messages. We categorized all missed identifiers into 3 groups: (1) complete HIPAA-specified identifiers, (2) HIPAA-specified identifier fragments, and (3) non-HIPAA-specified identifiers (such as provider names and addresses). In the first evaluation, MeDS scrubbed 11,273 (99.06%) of the 11,380 HIPAA-specified identifiers and 38,095 (98.26%) of the 38,768 non-HIPAA-specified identifiers. In the second evaluation, performed after the software was modified, MeDS scrubbed 79,993 (99.47%) of the 80,418 HIPAA-specified identifiers and 12,689 (96.93%) of the 13,091 non-HIPAA-specified identifiers. Approximately 95% of scrubbed messages were both readable and interpretable. We conclude that MeDS successfully de-identifies a wide range of medical documents from numerous sources and creates scrubbed reports that retain their interpretability, thereby maintaining their usefulness for research.
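The scrub rates reported above are simple ratios of identifiers scrubbed to identifiers present in the gold standard. The following is a minimal sketch that recomputes them from the reported counts. The function name `scrub_rate`, the `REPORTED` table, and its labels are illustrative assumptions and are not part of MeDS or the original study.

```python
# Counts taken from the abstract: (identifiers scrubbed, identifiers present,
# percentage reported). Labels and names here are illustrative only.
REPORTED = {
    "evaluation 1, HIPAA-specified": (11_273, 11_380, 99.06),
    "evaluation 1, non-HIPAA-specified": (38_095, 38_768, 98.26),
    "evaluation 2, HIPAA-specified": (79_993, 80_418, 99.47),
    "evaluation 2, non-HIPAA-specified": (12_689, 13_091, 96.93),
}


def scrub_rate(scrubbed: int, total: int) -> float:
    """Return the percentage of identifiers scrubbed, rounded to 2 decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= scrubbed <= total:
        raise ValueError("scrubbed must lie between 0 and total")
    return round(100.0 * scrubbed / total, 2)


for label, (scrubbed, total, reported) in REPORTED.items():
    rate = scrub_rate(scrubbed, total)
    missed = total - scrubbed  # identifiers left in the scrubbed output
    print(f"{label}: {rate:.2f}% scrubbed, {missed} missed "
          f"(abstract reports {reported:.2f}%)")
```

By subtraction, the same counts give 107 and 673 missed identifiers in the first evaluation and 425 and 402 in the second.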