Case Reports
BMC Med Inform Decis Mak. 2023 Jan 26;23(1):20.
doi: 10.1186/s12911-023-02117-3.

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach

Shaina Raza et al. BMC Med Inform Decis Mak.

Abstract

Background: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.

Objective: This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature.

Methods: The proposed framework integrates three layers: a data layer that prepares a data cohort from clinical case reports; an NLP layer that finds clinical and demographic named entities and the relations between them in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
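The three-layer structure described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the models in detail, so the NLP layer here is replaced by a toy lexicon lookup, and the names `LEXICON`, `Entity`, `data_layer`, `nlp_layer`, and `evaluation_layer` are all hypothetical. The evaluation layer assumes exact-match, entity-level precision, recall, and F1, which is a common way to benchmark named entity recognition but is an assumption here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    label: str   # entity type, e.g. "SYMPTOM"
    text: str    # surface form as it appears in the report


# Hypothetical toy lexicon standing in for a trained NER model.
LEXICON = {
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
    "pneumonia": "DISEASE",
    "paxlovid": "DRUG",
}


def data_layer(raw_reports):
    """Data layer: normalise whitespace and drop empty reports."""
    cleaned = (" ".join(report.split()) for report in raw_reports)
    return [report for report in cleaned if report]


def nlp_layer(text):
    """NLP layer stand-in: case-insensitive lexicon lookup, not a transformer."""
    entities = []
    for term, label in LEXICON.items():
        pattern = rf"\b{re.escape(term)}\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            entities.append(Entity(match.start(), match.end(), label, match.group()))
    return sorted(entities, key=lambda e: e.start)


def evaluation_layer(predicted, gold):
    """Evaluation layer: exact-match, entity-level precision, recall, and F1."""
    pred = {(e.start, e.end, e.label) for e in predicted}
    ref = {(e.start, e.end, e.label) for e in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In use, each cleaned report from `data_layer` would be passed to `nlp_layer`, and the predicted entities compared against hand-annotated ones with `evaluation_layer`. Swapping the lexicon lookup for a trained model would leave the other two layers unchanged, which is the main appeal of the layered design.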

Results: The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% over benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A detailed analysis of the extracted information reveals the presence of disease and the prevalence of symptoms in patients.

Conclusions: A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.

Keywords: Artificial intelligence; COVID-19; Data cohort; Named entity; Natural language processing; Relation extraction; Transfer learning.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Proposed framework for pandemic surveillance
Fig. 2. Proposed model for named entity recognition
Fig. 3. Zero-shot learning-based transformer model for relation extraction
Fig. 4. Frequency of COVID-19 symptoms
Fig. 5. Distribution of most frequent medical complications in the population
Fig. 6. Condition prevalence related to different disease syndromes (Cerebrovascular, Cardiovascular, Pulmonary, Psychological). Bars represent the number of respondents who experienced each symptom at any point in their illness
Fig. 7. COVID-19 hospitalization by race
Fig. 8. Temporal relations in a text
Fig. 9. Adverse drug effect with Paxlovid drug

