J Biomed Inform. 2012 Feb;45(1):71-81.
doi: 10.1016/j.jbi.2011.08.020. Epub 2011 Sep 9.

Building an automated SOAP classifier for emergency department reports


Danielle Mowery et al. J Biomed Inform. 2012 Feb.

Abstract

Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F1 scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
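The two measures reported in the abstract, Cohen's kappa for annotator agreement and the per-class F1 score, can be illustrated with a minimal Python sketch. It assumes each sentence carries exactly one SOAP label and that each class is scored as a one-vs-rest problem; the function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

SOAP_CLASSES = ("subjective", "objective", "assessment", "plan")


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same sentences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of sentences with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def f1_for_class(gold, predicted, positive):
    """One-vs-rest F1 for a single class (0.0 when there are no true positives)."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for four sentences (illustrative only, not study data).
annotator_1 = ["subjective", "subjective", "objective", "plan"]
annotator_2 = ["subjective", "objective", "objective", "plan"]

kappa = cohen_kappa(annotator_1, annotator_2)
f1_scores = {c: f1_for_class(annotator_1, annotator_2, c) for c in SOAP_CLASSES}
```

The abstract reports F1 on a 0 to 100 scale, so a value returned by `f1_for_class` would be multiplied by 100 to compare with the figures above.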


Figures

Figure 1
Experimental design for determining performance with (1) all feature groups, (2) all features included through feature selection, (3) each feature group individually, and (4) each feature group individually held out.
Figure 2
Temporal progression of conditions mentioned in clinical text.
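Three of the four conditions in the Figure 1 caption (all feature groups, each group individually, and each group held out) can be enumerated mechanically. The sketch below is a rough illustration only: the feature group names are hypothetical placeholders, and the feature selection condition is left out because the abstract does not state which selection method was used.

```python
def ablation_configs(feature_groups):
    """Map a run name to the list of feature groups enabled for that run."""
    groups = list(feature_groups)
    configs = {"all": list(groups)}
    for group in groups:
        # Condition (3): train with this feature group alone.
        configs[f"only:{group}"] = [group]
        # Condition (4): train with every group except this one.
        configs[f"without:{group}"] = [g for g in groups if g != group]
    return configs


# Hypothetical group names, not the feature groups used in the paper.
example_groups = ["lexical", "syntactic", "semantic"]
configs = ablation_configs(example_groups)

for run_name, enabled in configs.items():
    print(run_name, enabled)
```

Each entry would then drive one train-and-evaluate run, so the per-run scores can be compared against the "all" baseline.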

