Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach
- PMID: 36703154
- PMCID: PMC9879259
- DOI: 10.1186/s12911-023-02117-3
Abstract
Background: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.
Objective: This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature.
Methods: The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic-named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
Results: The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% over benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A thorough examination reveals the presence of the disease and the prevalence of symptoms in patients.
Conclusions: A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.
Keywords: Artificial intelligence; COVID-19; Data cohort; Named entity; Natural language processing; Relation extraction; Transfer learning.
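The three-layer framework described in the Methods (data layer, NLP layer, evaluation layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the models used, so a small dictionary lookup stands in for the trained named entity recognizer and a sentence-level co-occurrence rule stands in for the relation extractor. The lexicon, the label names, the `symptom_of` relation, and all function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicon standing in for a trained NER model (assumption).
LEXICON = {
    "covid-19": "DISEASE",
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
}


def data_layer(reports):
    """Build a toy cohort: keep only reports that mention COVID-19."""
    return [r.strip() for r in reports if "covid-19" in r.lower()]


def find_entities(sentence):
    """Return (term, label) pairs in order of appearance (dictionary lookup)."""
    lowered = sentence.lower()
    hits = []
    for term, label in LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((match.start(), term, label))
    return [(term, label) for _, term, label in sorted(hits)]


def nlp_layer(report):
    """Extract entities and sentence-level symptom/disease relations."""
    entities, relations = [], []
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        found = find_entities(sentence)
        entities.extend(found)
        diseases = [t for t, label in found if label == "DISEASE"]
        symptoms = [t for t, label in found if label == "SYMPTOM"]
        # Co-occurrence heuristic: link each symptom to each disease
        # mentioned in the same sentence (an assumption, not the paper's method).
        relations.extend((s, "symptom_of", d) for s in symptoms for d in diseases)
    return entities, relations


def evaluation_layer(predicted, gold):
    """Compute precision, recall, and F1 over sets of predicted and gold items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    reports = [
        "A 45-year-old man with COVID-19 presented with fever and cough.",
        "A patient was admitted with a fractured wrist.",
    ]
    for report in data_layer(reports):
        entities, relations = nlp_layer(report)
        print(entities)
        print(relations)
```

In a realistic setting the dictionary lookup would be replaced by a trained sequence-labeling model, and the evaluation layer would compare predictions against manually annotated case reports; the sketch only shows how the three layers hand data to one another.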
© 2023. Crown.
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- Constructing a disease database and using natural language processing to capture and standardize free text clinical information. Sci Rep. 2023 May 26;13(1):8591. doi: 10.1038/s41598-023-35482-0. PMID: 37237101. Free PMC article.
- Extracting entities with attributes in clinical text via joint deep learning. J Am Med Inform Assoc. 2019 Dec 1;26(12):1584-1591. doi: 10.1093/jamia/ocz158. PMID: 31550346. Free PMC article.
- Entity recognition from clinical texts via recurrent neural network. BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):67. doi: 10.1186/s12911-017-0468-7. PMID: 28699566. Free PMC article.
- Named Entity Recognition and Relation Detection for Biomedical Information Extraction. Front Cell Dev Biol. 2020 Aug 28;8:673. doi: 10.3389/fcell.2020.00673. eCollection 2020. PMID: 32984300. Free PMC article. Review.
- Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annu Rev Biomed Data Sci. 2021 Jul 20;4:313-339. doi: 10.1146/annurev-biodatasci-021821-061045. Epub 2021 May 14. PMID: 34465169. Review.
Cited by
- Applied artificial intelligence in dentistry: emerging data modalities and modeling approaches. Front Artif Intell. 2024 Jul 23;7:1427517. doi: 10.3389/frai.2024.1427517. eCollection 2024. PMID: 39109324. Free PMC article. Review.
- Exploring COVID-related relationship extraction: Contrasting data sources and analyzing misinformation. Heliyon. 2024 Feb 28;10(5):e26973. doi: 10.1016/j.heliyon.2024.e26973. eCollection 2024 Mar 15. PMID: 38455555. Free PMC article.
- A framework for multi-faceted content analysis of social media chatter regarding non-medical use of prescription medications. BMC Digit Health. 2023;1:29. doi: 10.1186/s44247-023-00029-w. Epub 2023 Aug 7. PMID: 37680768. Free PMC article.
- Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study. J Med Internet Res. 2025 Jun 3;27:e75052. doi: 10.2196/75052. PMID: 40460423. Free PMC article.