J Biomed Inform. 2014 Oct;51:100-6.
doi: 10.1016/j.jbi.2014.04.013. Epub 2014 Apr 21.

Improving search over Electronic Health Records using UMLS-based query expansion through random walks

Free article

David Martinez et al. J Biomed Inform. 2014 Oct.

Abstract

Objective: Most of the information in Electronic Health Records (EHRs) is stored as free text. Practitioners searching EHRs need to phrase their queries carefully, as a record might use synonyms or other related words instead of the query terms. In this paper we show that an automatic query expansion method based on the Unified Medical Language System (UMLS) Metathesaurus improves the results of a robust baseline when searching EHRs.

Materials and methods: The method uses a graph representation of the lexical units, concepts and relations in the UMLS Metathesaurus. It is based on random walks over the graph, which start at the query terms. Random walks are a well-studied technique that has been applied to both Web and knowledge base graphs.
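
The abstract does not give the exact algorithm, so the following is only a minimal sketch of the general idea, assuming a random walk with restart (personalized PageRank) over a small undirected graph. The graph, the `concept:` node prefix, the function names and the damping value are all hypothetical illustrations, not the authors' implementation or real UMLS data.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Random walk with restart: the walker restarts at the seed nodes."""
    # Build an undirected adjacency map from the edge list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)
    seeds_in_graph = [s for s in seeds if s in neighbors]
    if not seeds_in_graph:
        raise ValueError("no query term is present in the graph")
    # Restart distribution: uniform over the query terms found in the graph.
    restart = {n: 0.0 for n in nodes}
    for s in seeds_in_graph:
        restart[s] = 1.0 / len(seeds_in_graph)
    rank = dict(restart)
    for _ in range(iterations):
        # Each step: restart with probability (1 - damping), else follow an edge.
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(neighbors[n])
            for m in neighbors[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


def expand_query(edges, query_terms, k=2):
    """Return the k highest-ranked lexical units that are not in the query."""
    rank = personalized_pagerank(edges, query_terms)
    candidates = [
        (node, score)
        for node, score in rank.items()
        if node not in query_terms and not node.startswith("concept:")
    ]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [node for node, _ in candidates[:k]]


# Hypothetical toy graph: lexical units linked to made-up concept nodes.
TOY_EDGES = [
    ("heart attack", "concept:MI"),
    ("myocardial infarction", "concept:MI"),
    ("concept:MI", "concept:CAD"),
    ("coronary artery disease", "concept:CAD"),
    ("concept:CAD", "concept:chest_pain"),
    ("chest pain", "concept:chest_pain"),
]

if __name__ == "__main__":
    print(expand_query(TOY_EDGES, ["heart attack"], k=2))
```

In this toy setting the walk assigns the most weight to terms close to the query in the graph, so a synonym sharing a concept with the query term ranks above terms that are only reachable through related concepts.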

Results: Our experiments over the TREC Medical Record track show improvements in both the 2011 and 2012 datasets over a strong baseline.

Discussion: Our analysis shows that the success of our method is due to the automatic expansion of the query with extra terms, even when they are not directly related in the UMLS Metathesaurus. The terms added in the expansion go beyond simple synonyms, and also add other kinds of topically related terms.

Conclusions: Expansion of queries using related terms in the UMLS Metathesaurus beyond synonymy is an effective way to overcome the gap between query and document vocabularies when searching for patient cohorts.

Keywords: Algorithms; Data mining; Information storage and retrieval; Natural language processing; Semantics.
