BMC Med Inform Decis Mak. 2005 Aug 31;5:30.
doi: 10.1186/1472-6947-5-30.

Automation of a problem list using natural language processing

Stephane Meystre et al. BMC Med Inform Decis Mak.

Abstract

Background: The medical problem list is an important part of the electronic medical record under development at our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To address this, we are building an environment where the problem list can be easily and effectively maintained.

Methods: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system with two main components: a background application and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries, restricted to the 80 targeted problems, from the free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application, which is designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list.
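The division of labor between the two applications can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the three-item `TARGETED_PROBLEMS` set, and the `propose`/`accept` methods are all hypothetical, chosen only to show proposals being queued by a background step and promoted to the official list by a physician.

```python
from dataclasses import dataclass, field

# Hypothetical subset of the 80 targeted problems (the full list is not given here).
TARGETED_PROBLEMS = {"dyspnea", "headache", "hypertension"}


@dataclass(frozen=True)
class ProposedProblem:
    name: str          # problem label, e.g. "headache"
    document_id: str   # source document the problem was found in
    sentence: str      # source sentence, kept so a reviewer can check it


@dataclass
class ProblemList:
    official: list = field(default_factory=list)
    proposed: list = field(default_factory=list)

    def propose(self, problem: ProposedProblem) -> bool:
        """Background step: queue a problem unless it is untargeted or already known."""
        known = {p.name for p in self.official + self.proposed}
        if problem.name in TARGETED_PROBLEMS and problem.name not in known:
            self.proposed.append(problem)
            return True
        return False

    def accept(self, name: str) -> bool:
        """Foreground step: a physician moves a proposal onto the official list."""
        for p in self.proposed:
            if p.name == name:
                self.proposed.remove(p)
                self.official.append(p)
                return True
        return False
```

Keeping the source document and sentence on each proposal mirrors the idea that a physician reviews the evidence before a problem becomes official; nothing is added to `official` without an explicit `accept` call.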

Results: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms that first detect document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences.
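As a rough illustration of the first two stages and of how the reported measures are computed, the Python sketch below splits a note into sections and sentences and scores the result against a reference set. The regular expressions and function names are assumptions made for this example (headers are assumed to sit on their own line and end with a colon); the paper's actual detection algorithms are not described here. Sensitivity is taken as TP / (TP + FN) and positive predictive value as TP / (TP + FP).

```python
import re

# Assumption: a section header is a capitalized phrase alone on a line, ending in ":".
SECTION_HEADER = re.compile(r"^([A-Z][A-Za-z ]+):[ \t]*$", re.MULTILINE)
# Assumption: a sentence ends at ".", "!" or "?" followed by whitespace.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def split_sections(text):
    """Return {section title: section body} for each detected header."""
    matches = list(SECTION_HEADER.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1)] = text[m.end():end].strip()
    return sections


def split_sentences(body):
    """Split a section body into sentences, dropping empty pieces."""
    return [s.strip() for s in SENTENCE_BREAK.split(body) if s.strip()]


def sensitivity_and_ppv(detected, gold):
    """Compare detected items with a reference (gold) set of items."""
    detected, gold = set(detected), set(gold)
    true_positives = len(detected & gold)
    sensitivity = true_positives / len(gold) if gold else 0.0
    ppv = true_positives / len(detected) if detected else 0.0
    return sensitivity, ppv
```

A section is scored the same way as a sentence: the detected titles are compared with the titles a human reviewer marked, so one helper serves both evaluations.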

Conclusion: The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information.

Figures

Figure 1
Information model diagram. Analysis of free-text documents results in the creation of a CDA version of the analyzed document and of an ASN.1 Problem record for each medical problem detected.
Figure 2
Automated Problem List system diagram. The two main components of the Automated Problem List system (the background application and the problem list management application) are displayed with the elements of our clinical information system they interact with. The ASN.1 data model used is called MultiMedia, with a GenericBigXMLTextObs data type. It allows XML files to be stored as plain text while still being recognized as XML.
Figure 3
Medical problems Bayesian Network. Bayesian Network with example values for each node when analyzing the sentence "The patient presents with shortness of breath" in the "History of Present Illness" section of a "Consultation Note". Note the application of within-document context represented by the Document Type and Document Section nodes.
Figure 4
Training cases creation process. Sentences were first selected from the set of sentences resulting from the section and sentence detection of free-text documents. Regular expressions and a list of phrases representing possible ways of describing each of the 80 targeted problems were used for this task. The resulting pre-training cases were then augmented by a human reviewer adding the state and state phrase in each sentence. The resulting file contained 4436 training cases.
Figure 5
Example CDA document. XML Clinical Document Architecture version of the analyzed document.
Figure 6
Example rendered HTML version of the document. Example of the customized HTML version of the document, as seen when linked from the problem "headache".
Figure 7
Screenshot of the problem list management application. Problem list management application with the viewer window showing the source document of the problem headache with the source sentence highlighted in red.
Figure 8
Preliminary XML manipulation example. In this extract of a CDA document, the code of the Observation element (dyspnea problem) and the reference identifiers are in bold characters. Reference identifiers link Observation elements (i.e. coded problems) to content elements (i.e. sentence(s) they were extracted from).
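
The linking described for Figure 8 can be sketched in Python with the standard library. The fragment below is a simplified, hypothetical stand-in and not the real CDA schema: the element names (`content`, `observation`, `code`, `reference`), the `ID`, `value`, and `displayName` attributes, and the absence of namespaces are all assumptions made for illustration. It shows only the idea of resolving a reference identifier from a coded problem back to its source sentence.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment loosely modeled on the caption's description.
SAMPLE = """
<section>
  <text>
    <content ID="s1">The patient presents with shortness of breath.</content>
    <content ID="s2">He denies chest pain.</content>
  </text>
  <observation>
    <code displayName="dyspnea"/>
    <reference value="#s1"/>
  </observation>
</section>
"""


def link_problems_to_sentences(xml_text):
    """Return (problem name, source sentence) pairs by resolving reference identifiers."""
    root = ET.fromstring(xml_text.strip())
    # Index every sentence element by its identifier.
    sentences = {c.get("ID"): (c.text or "").strip() for c in root.iter("content")}
    links = []
    for obs in root.iter("observation"):
        code = obs.find("code")
        name = code.get("displayName") if code is not None else None
        for ref in obs.iter("reference"):
            sentence_id = (ref.get("value") or "").lstrip("#")
            if sentence_id in sentences:
                links.append((name, sentences[sentence_id]))
    return links
```

A viewer could use such pairs to highlight the source sentence of a selected problem, as the Figure 7 caption describes for the problem list management application.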
