J Am Med Inform Assoc. 2014 Jan-Feb;21(1):64-72.
doi: 10.1136/amiajnl-2012-001159. Epub 2013 May 15.

A corpus-based approach for automated LOINC mapping

Mustafa Fidahussein et al. J Am Med Inform Assoc. 2014 Jan-Feb.

Abstract

Objective: To determine whether the knowledge contained in a rich corpus of local terms mapped to LOINC (Logical Observation Identifiers Names and Codes) could be leveraged to help map local terms from other institutions.

Methods: We developed two models to test our hypothesis. The first, based on supervised machine learning, was created using Apache's OpenNLP Maxent; the second, based on information retrieval, was created using Apache's Lucene. The models were validated by repeated random subsampling: 20 iterations, each using an 80/20 split for training and testing, respectively. We also evaluated the performance of these models on all laboratory terms from three test institutions.
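
The validation scheme described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function name `repeated_random_subsampling`, the fixed seed, and the placeholder corpus of (local term, code) pairs are all assumptions made for illustration. It only shows how 20 independent 80/20 random splits might be drawn from a mapped corpus.

```python
import random


def repeated_random_subsampling(terms, n_iterations=20, train_fraction=0.8, seed=0):
    """Yield (train, test) pairs, drawing a fresh random split on each iteration.

    Hypothetical helper: the name and defaults are illustrative, chosen to
    mirror the 20 iterations and 80/20 splits mentioned in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_train = int(len(terms) * train_fraction)
    for _ in range(n_iterations):
        shuffled = list(terms)  # copy so the caller's list is left untouched
        rng.shuffle(shuffled)
        yield shuffled[:n_train], shuffled[n_train:]


# Placeholder corpus: (local term description, mapped code) pairs.
# The codes are made-up labels, not real LOINC codes.
corpus = [(f"local term {i}", f"CODE-{i % 10}") for i in range(100)]

splits = list(repeated_random_subsampling(corpus))
```

Unlike k-fold cross-validation, the test sets drawn this way may overlap between iterations; each iteration is an independent random partition of the same corpus.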

Results: For the 20 iterations used for validation of our 80/20 splits, Maxent and Lucene ranked the correct LOINC code first for between 70.5% and 71.4% and between 63.7% and 65.0% of local terms, respectively. For all laboratory terms from the three test institutions, Maxent ranked the correct LOINC code first for between 73.5% and 84.6% (mean 78.9%) of local terms, whereas Lucene's performance was between 66.5% and 76.6% (mean 71.9%). Using a cut-off score of 0.46, Maxent always ranked the correct LOINC code first for over 57% of local terms.
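
The two kinds of figures reported above (how often the correct code is ranked first or among the top candidates, and what happens when a score cut-off is applied) can be sketched as follows. This is an illustration under stated assumptions, not the study's evaluation code: the helper names `top_k_rate` and `accepted_at_cutoff`, the toy scores, and the single-letter codes are hypothetical, and the cut-off is read here as "keep only terms whose top-ranked candidate scores at or above the threshold", which is one plausible interpretation of the abstract.

```python
def top_k_rate(ranked, gold, k=1):
    """Fraction of terms whose gold code is among the first k ranked candidates.

    `ranked` maps each term to a list of (code, score) pairs sorted by
    descending score; `gold` maps each term to its manually mapped code.
    Assumes `ranked` is non-empty.
    """
    hits = sum(
        1
        for term, candidates in ranked.items()
        if gold[term] in [code for code, _ in candidates[:k]]
    )
    return hits / len(ranked)


def accepted_at_cutoff(ranked, cutoff):
    """Keep only terms whose top candidate scores at or above the cut-off."""
    return {
        term: candidates
        for term, candidates in ranked.items()
        if candidates and candidates[0][1] >= cutoff
    }


# Toy model output: made-up terms, codes, and scores for illustration only.
ranked = {
    "t1": [("A", 0.90), ("B", 0.05)],
    "t2": [("C", 0.40), ("D", 0.35)],
    "t3": [("E", 0.70), ("F", 0.20)],
    "t4": [("G", 0.30), ("H", 0.25)],
}
gold = {"t1": "A", "t2": "D", "t3": "E", "t4": "G"}

top1 = top_k_rate(ranked, gold, k=1)           # correct code ranked first
top2 = top_k_rate(ranked, gold, k=2)           # correct code among the top 2
confident = accepted_at_cutoff(ranked, 0.46)   # terms passing the cut-off
coverage = len(confident) / len(ranked)        # share of terms passing it
top1_confident = top_k_rate(confident, gold, k=1)
```

Raising the cut-off trades coverage for precision: fewer terms are accepted automatically, but those that are accepted are more likely to carry the correct code in first position, leaving the remainder for manual review.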

Conclusions: This study showed that a rich corpus of local terms mapped to LOINC contains collective knowledge that can help map terms from other institutions. Using freely available software tools, we developed a data-driven automated approach that operates on term descriptions from existing mappings in the corpus. Accurate and efficient automated mapping methods can help to accelerate adoption of vocabulary standards and promote widespread health information exchange.

Keywords: LOINC; automated mapping; health information exchange; information retrieval; local laboratory tests; supervised machine learning.

Figures

Figure 1. Growth of unique LOINC codes mapped to local terms and unique words in local term descriptions with an incrementally growing corpus.

Figure 2. Results of 20 iterations of repeated random subsampling validation showing the percentage of test terms with manually mapped LOINC codes ranked first (top 1) and among the top 5 by Maxent and Lucene.

Figure 3. Rank of correct LOINC codes and their Maxent score for local laboratory terms from three test institutions.

Figure 4. Performance of Maxent and Lucene when applying the test set against an incrementally growing corpus.
