Comparative Study

PLoS One. 2018 Feb 15;13(2):e0192360.

doi: 10.1371/journal.pone.0192360. eCollection 2018.

Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives

Sebastian Gehrmann^{1,2}, Franck Dernoncourt^{1,3,4}, Yeran Li^{1,5}, Eric T Carlson^{1,6}, Joy T Wu^{1,5}, Jonathan Welt^{1,7}, John Foote Jr^{1,8}, Edward T Moseley^{1,9}, David W Grant^{1,10}, Patrick D Tyler^{1,11}, Leo A Celi^{1,3}

Affiliations

¹ MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America.
² Harvard SEAS, Harvard University, Cambridge, MA, United States of America.
³ Massachusetts Institute of Technology, Cambridge, MA, United States of America.
⁴ Adobe Research, San Jose, CA, United States of America.
⁵ Harvard T.H. Chan School of Public Health, Cambridge, MA, United States of America.
⁶ Philips Research North America, Cambridge, MA, United States of America.
⁷ Wellman Center for Photomedicine, Massachusetts General Hospital, Boston, MA, United States of America.
⁸ Tufts University School of Medicine, Cambridge, MA, United States of America.
⁹ College of Science and Mathematics, University of Massachusetts, Boston, MA, United States of America.
¹⁰ Department of Surgery, Division of Plastic and Reconstructive Surgery, Washington University School of Medicine, St. Louis, MO, United States of America.
¹¹ Department of Internal Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States of America.

PMID: 29447188
PMCID: PMC5813927
DOI: 10.1371/journal.pone.0192360

Sebastian Gehrmann et al. PLoS One. 2018.

Abstract

In secondary analysis of electronic health records, a crucial task is to correctly identify the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate information from these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNNs) for text classification can augment the existing techniques by leveraging this representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used NLP models on ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with improvements of up to 26 points in F1-score and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Fig 1. Overview of the basic CNN architecture.**
(A) Each word within a discharge note is represented as its word embedding. In this example, both instances of the word “and” will have the same embedding. (B) Convolutions of different widths are used to learn filters that are applied to word sequences of the corresponding length. The convolution K2 with width 2 in the example looks at all 10 combinations of two neighboring words and outputs one value for each. There can be multiple feature maps for each convolution width. (C) Each resulting vector is reduced to its highest value (the one with the most signaling power), giving one value per convolution. (D) The final prediction (“Does the phenotype apply to the patient?”) is made by computing a weighted combination of the pooled values and applying a sigmoid function, similar to a logistic regression. This figure is adapted with permission from Kim [33].
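
The four steps in the caption can be traced in a few lines of plain Python. This is only a minimal sketch of the forward pass, not the authors' implementation: the example sentence, the embedding size `DIM`, the random (untrained) weights, the ReLU nonlinearity, and the helper names `embed`, `make_kernel`, and `convolve` are all assumptions made for illustration. It also assumes the note has at least as many words as the widest filter.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
DIM = 4                 # illustrative embedding size (an assumption)


def embed(tokens, table):
    """(A) Map each word to its embedding; identical words share one vector."""
    vectors = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1, 1) for _ in range(DIM)]
        vectors.append(table[tok])
    return vectors


def make_kernel(width):
    """Create one random, untrained filter spanning `width` consecutive words."""
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(width)]


def convolve(vectors, kernel):
    """(B) Slide the filter over the note; emit one value per word window."""
    width = len(kernel)
    feature_map = []
    for start in range(len(vectors) - width + 1):
        total = 0.0
        for offset in range(width):
            row, vec = kernel[offset], vectors[start + offset]
            total += sum(k * x for k, x in zip(row, vec))
        feature_map.append(max(0.0, total))  # ReLU, an assumed nonlinearity
    return feature_map


# Hypothetical 11-word note containing the word "and" twice.
tokens = "patient denies chest pain and fever and reports no alcohol use".split()
table = {}
vectors = embed(tokens, table)

# One filter each for widths 1, 2 and 3 (real models use many per width).
kernels = [make_kernel(width) for width in (1, 2, 3)]
feature_map_k2 = convolve(vectors, kernels[1])

# (C) Max-over-time pooling keeps the strongest response of each filter.
pooled = [max(convolve(vectors, kernel)) for kernel in kernels]

# (D) Weighted combination of pooled values followed by a sigmoid.
out_weights = [rng.uniform(-1, 1) for _ in kernels]
logit = sum(w * p for w, p in zip(out_weights, pooled))
probability = 1.0 / (1.0 + math.exp(-logit))

print(len(feature_map_k2))  # 10 windows of two neighboring words
```

With 11 words, a width-2 filter sees 11 - 2 + 1 = 10 windows, matching the count in the caption. Because the weights are untrained, `probability` carries no clinical meaning here; in a trained model the position of each pooled maximum indicates which phrase drove the prediction.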

**Fig 2. Comparison of achieved F1-scores across all tested phenotypes.**
The left three models classify directly from text; the right two models are concept-extraction based. The CNN outperforms the other models on most tasks.

**Fig 3. Impact of phrase length on model performance.**
The figure shows the change in F1-score between a model that considers only single words and a model that considers phrases up to a length of 5.

References

1. Data MC. Secondary Analysis of Electronic Health Records. Springer; 2016.
2. Charles D, Gabriel M, Furukawa MF. Adoption of electronic health record systems among US non-federal acute care hospitals: 2008-2012. ONC data brief. 2013;9:1–9.
3. Saeed M, Lieu C, Raber G, Mark RG. MIMIC II: a massive temporal ICU patient database to support research in intelligent patient monitoring. In: Computers in Cardiology, 2002. IEEE; 2002. p. 641–644.
4. Johnson AE, Pollard TJ, Shen L, Lehman LwH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Scientific data. 2016;3. doi: 10.1038/sdata.2016.35
5. Uzuner Ö, Goldstein I, Luo Y, Kohane I. Identifying patient smoking status from medical discharge records. Journal of the American Medical Informatics Association. 2008;15(1):14–24. doi: 10.1197/jamia.M2408
