Review
Knowl Inf Syst. 2023;65(2):463-516.
doi: 10.1007/s10115-022-01779-1. Epub 2022 Nov 8.

Information extraction from electronic medical documents: state of the art and future research directions

Mohamed Yassine Landolsi et al. Knowl Inf Syst. 2023.

Abstract

In the medical field, a doctor must acquire comprehensive knowledge by reading and writing narrative documents, and is responsible for every decision taken for patients. Unfortunately, reading all the necessary information about drugs, diseases and patients is very tiring because the number of documents grows every day. Consequently, many medical errors can occur, some of them fatal. Information extraction is an important field that can address this problem. It comprises several tasks that extract the desired information from unstructured text written in natural language. The principal tasks are named entity recognition and relation extraction, since they structure the text by extracting the relevant information. However, processing narrative text requires natural language processing techniques to extract useful information and features. In this paper, we introduce and discuss the techniques and solutions used in these tasks. Furthermore, we outline the challenges in information extraction from medical documents. To our knowledge, this is the most comprehensive survey in the literature, with an experimental analysis and suggestions for some unexplored directions.

Keywords: Electronic medical records; Information extraction; Medical named entities recognition; Medical relation extraction; Section detection.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1: The general steps of the EMR data processing

