Comparative Study
J Am Med Inform Assoc. 2013 Jan 1;20(1):84-94. doi: 10.1136/amiajnl-2012-001012. Epub 2012 Aug 2.

Large-scale evaluation of automated clinical note de-identification and its impact on information extraction

Louise Deleger et al. J Am Med Inform Assoc. 2013.

Abstract

Objective: (1) To evaluate a state-of-the-art natural language processing (NLP)-based approach to automatically de-identify a large set of diverse clinical notes. (2) To measure the impact of de-identification on the performance of information extraction algorithms on the de-identified documents.

Material and methods: A cross-sectional study that included 3503 stratified, randomly selected clinical notes (over 22 note types) from five million documents produced at one of the largest US pediatric hospitals. The sensitivity, precision, and F value of two automated de-identification systems for removing all 18 HIPAA-defined protected health information elements were computed. Performance was assessed against a manually generated 'gold standard'. Statistical significance was tested. The automated de-identification performance was also compared with that of two humans on a 10% subsample of the gold standard. The effect of de-identification on the performance of subsequent medication extraction was measured.
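The evaluation metrics named above reduce to simple ratios over true positives, false positives, and false negatives counted against the gold-standard annotations. A minimal illustrative sketch follows; the function name and counts are invented for illustration and are not taken from the study:

```python
# Sketch of the evaluation metrics used in the study: sensitivity (recall),
# precision, and F value, computed from counts of system outputs matched
# against a manually annotated gold standard. Counts below are hypothetical.

def deid_metrics(true_pos, false_pos, false_neg):
    """Return (sensitivity, precision, F value) for PHI detection."""
    sensitivity = true_pos / (true_pos + false_neg)  # PHI found / PHI in gold standard
    precision = true_pos / (true_pos + false_pos)    # PHI found / elements flagged
    f_value = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, precision, f_value

# Hypothetical counts: 900 PHI elements correctly removed,
# 50 spurious removals, 80 PHI elements missed.
r, p, f = deid_metrics(true_pos=900, false_pos=50, false_neg=80)
print(f"sensitivity={r:.4f} precision={p:.4f} F={f:.4f}")
```

The F value reported per PHI class in the figures below is this harmonic mean of precision and sensitivity.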

Results: The gold standard included 30 815 protected health information elements and more than one million tokens. The most accurate NLP method had 91.92% sensitivity (R) and 95.08% precision (P) overall. The performance of the system was indistinguishable from that of human annotators (the annotators' performance was 92.15% (R)/93.95% (P) and 94.55% (R)/88.45% (P) overall, while the best system obtained 92.91% (R)/95.73% (P) on the same text). The impact of automated de-identification on the utility of the narrative notes for subsequent information extraction was minimal, as measured by the sensitivity and precision of medication name extraction.

Discussion and conclusion: NLP-based de-identification shows excellent performance that rivals the performance of human annotators. Furthermore, unlike manual de-identification, the automated approach scales up to millions of documents quickly and inexpensively.


Conflict of interest statement

Competing interests: None.

Figures

Figure 1
Descriptive statistics of the corpus. DC, discharge; ED, emergency department; H&P, history and physical; OR, operating room.
Figure 2
De-identification process. CRF, conditional random field; PHI, protected health information.
Figure 3
Number of annotated protected health information (PHI) elements for each document type. DC, discharge; ED, emergency department; H&P, history and physical; OR, operating room.
Figure 4
Inter-annotator agreement (IAA; F value) for each protected health information (PHI) class on the entire gold standard (annotators 1 and 2) and on the 10% common sample (annotators 1, 2, 3, and 4).
Figure 5
F values obtained by the systems and the humans. MCRF, Mallet conditional random field; MIST, MITRE Identification Scrubber Toolkit.
Figure 6
Recall variations obtained by adjusting MIST's bias parameter and using thresholds for Mallet CRF probability scores (customized systems). CRF, conditional random field; MIST, MITRE Identification Scrubber Toolkit.
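The recall adjustment described in figure 6 amounts to lowering the probability at which a token is treated as PHI: a lower threshold misses fewer PHI elements (higher recall) but flags more non-PHI tokens (lower precision). A minimal sketch of that trade-off, with invented scores and gold labels rather than the study's actual CRF outputs:

```python
# Illustrative sketch of thresholding per-token PHI probability scores.
# Scores and gold labels are hypothetical, not from the study's Mallet CRF.

def label_phi(scores, threshold):
    """Mark a token as PHI when its probability score meets the threshold."""
    return [s >= threshold for s in scores]

gold = [True, True, True, False, False]      # which tokens are truly PHI
scores = [0.95, 0.60, 0.35, 0.40, 0.05]      # model's P(PHI) per token

for t in (0.5, 0.3):
    pred = label_phi(scores, t)
    tp = sum(p and g for p, g in zip(pred, gold))
    recall = tp / sum(gold)
    precision = tp / sum(pred)
    print(f"threshold={t}: recall={recall:.2f} precision={precision:.2f}")
```

With the invented scores above, dropping the threshold from 0.5 to 0.3 raises recall from 2/3 to 1.0 while precision falls from 1.0 to 0.75, mirroring the customized-system behavior the figure reports.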

References

    1. Meystre SM, Savova GK, Kipper-Schuler KC, et al. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform 2008:128–44 - PubMed
    1. Hicks J. The Potential of Claims Data to Support the Measurement of Health Care Quality. Santa Monica, CA: RAND Corporation, 2003
    1. Jha AK. The promise of electronic records: around the corner or down the road? JAMA 2011;306:880–1 - PubMed
    1. Demner-Fushman D, Chapman WW, McDonald CJ. What can natural language processing do for clinical decision support? J Biomed Inform 2009;42:760–72 - PMC - PubMed
    1. Warner JL, Anick P, Hong P, et al. Natural language processing and the oncologic history: is there a match? J Oncol Pract 2011;7:e15–19 - PMC - PubMed
