J Biomed Inform. 2018 Jul:83:63-72.
doi: 10.1016/j.jbi.2018.05.014. Epub 2018 May 22.

Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews

Cheng Ye et al. J Biomed Inform. 2018 Jul.

Abstract

Objective: Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks.

Materials and methods: Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms for use in keyword search and text highlighting. To evaluate the quality of the similar terms, we measured their information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally, a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart.
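
For readers unfamiliar with the metric, precision-at-K is the fraction of the top K ranked results that are relevant. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `precision_at_k` and the example items are hypothetical.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the first k ranked items that are relevant.

    Assumption: if fewer than k items were retrieved, the missing
    positions count as misses (the denominator stays k).
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    relevant = set(relevant_items)
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / k


# Hypothetical ranked results for one query; three of the top five are relevant.
ranked = ["doc1", "doc2", "doc3", "doc4", "doc5"]
relevant = {"doc1", "doc3", "doc5"}
p_at_5 = precision_at_k(ranked, relevant, 5)
```

Here `p_at_5` is 3/5. An average P@5 such as the one reported in the Results would be the mean of this value over all evaluated queries.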

Results: The refined terms outperformed the baseline method in information retrieval (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question.

Conclusions: Clinical information can be retrieved and synthesized more quickly when using semantically similar terms from multiple embeddings.

Keywords: Clinical similar terms; Electronic medical records (EMR); Highlighting; Query expansion; Search engines; Semantic embeddings.

Conflict of interest statement

Competing interests: None.

Figures

Figure 1.
Similar terms of “cancer” from the “Clinic Note” EMR-subset embedding broken down by intra-subset similarity, inter-subset similarity, and harmonic similarity. The harmonic similarity is used for ranking terms.
Figure 2.
Example of expanded document quality analysis for “epilepsy.” The proportion of high-similarity terms (i.e., terms with a similarity above 0.60, where 1.0 is the maximum) decreases as more similar terms are added.
Figure 3.
Example of similarity cutoff computation. Since all terms have similarities larger than 0.40, the y-axis starts from 0.3. Similarity cutoff is at the “elbow” of the similarity curve (arrow).
Figure 4.
Screenshot of the preference survey. An introduction is provided, followed by 14 questions that ask the participant to choose the best list to expand a keyword. List orders were randomized to hide source methods.
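
The ranking and cutoff steps described in the captions of Figures 1 and 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the harmonic similarity is the harmonic mean of the intra-subset and inter-subset similarities, and it locates the "elbow" as the point on the sorted similarity curve farthest from the straight line joining its first and last points. The abstract does not specify either formula, and all function names and example values are hypothetical.

```python
def harmonic_similarity(intra, inter):
    """Harmonic mean of two similarities (assumed combination rule)."""
    if intra <= 0 or inter <= 0:
        return 0.0
    return 2 * intra * inter / (intra + inter)


def rank_terms(scores):
    """Rank terms by harmonic similarity, highest first.

    scores maps term -> (intra_similarity, inter_similarity).
    """
    ranked = [(t, harmonic_similarity(a, b)) for t, (a, b) in scores.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def elbow_index(values):
    """Index of the point farthest from the chord joining the endpoints.

    values is a similarity curve sorted in descending order. The quantity
    computed is proportional to the perpendicular distance to the chord.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    if n < 3:
        return n - 1
    x1, y0, y1 = n - 1, values[0], values[-1]
    best_i, best_d = 0, -1.0
    for i, y in enumerate(values):
        d = abs((y1 - y0) * i - x1 * y + x1 * y0)
        if d > best_d:
            best_i, best_d = i, d
    return best_i


def truncate_at_elbow(ranked):
    """Keep terms up to and including the elbow (an assumed convention)."""
    if not ranked:
        return []
    idx = elbow_index([score for _, score in ranked])
    return ranked[: idx + 1]


# Hypothetical similarities, for illustration only.
example = rank_terms({
    "tumor": (0.8, 0.6),
    "carcinoma": (0.9, 0.9),
    "lesion": (0.4, 0.9),
})
curve = [0.9, 0.85, 0.8, 0.5, 0.48, 0.46, 0.45]
cut = elbow_index(curve)
```

The harmonic mean rewards terms that score well on both similarities and penalizes terms that are high on only one, which is a plausible reason to rank by it. Whether the elbow term itself is kept or dropped is a design choice that the caption does not settle.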
