Biomedical named entity recognition using BERT in the machine reading comprehension framework
- PMID: 33965638
- DOI: 10.1016/j.jbi.2021.103799
Abstract
Recognizing biomedical entities in the literature is a challenging research problem and the foundation for converting the large amount of biomedical knowledge held in unstructured text into structured formats. Biomedical named entity recognition (BioNER) is conventionally implemented in the sequence labeling framework. This approach, however, often cannot take full advantage of the semantic information in the dataset, and its performance is not always satisfactory. In this work, instead of treating BioNER as a sequence labeling problem, we formulate it as a machine reading comprehension (MRC) problem. This formulation can introduce more prior knowledge through well-designed queries and no longer needs decoding processes such as conditional random fields (CRF). We conduct experiments on six BioNER datasets, and the results demonstrate the effectiveness of our method. It achieves state-of-the-art (SOTA) performance on the BC4CHEMD, BC5CDR-Chem, BC5CDR-Disease, NCBI-Disease, BC2GM and JNLPBA datasets, with F1-scores of 92.92%, 94.19%, 87.83%, 90.04%, 85.48% and 78.93%, respectively.
Keywords: MRC; Machine reading comprehension; NER; Named entity recognition; Text mining.
Copyright © 2021 Elsevier Inc. All rights reserved.
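The MRC formulation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the query wording or the span-extraction procedure, so everything here is an assumption: the `QUERIES` text is hypothetical, the start/end flags are made-up stand-ins for model predictions, and the nearest-end pairing rule is a common choice in MRC-style NER rather than necessarily the authors' method. The sketch only shows the general idea of pairing one query per entity type with the passage and reading entity spans directly from predicted start and end positions, without a CRF decoding layer.

```python
from typing import Dict, List, Tuple

# Hypothetical queries: the actual query wording used in the paper
# is not given in the abstract.
QUERIES: Dict[str, str] = {
    "Chemical": "Find chemical and drug names in the text.",
    "Disease": "Find disease and disorder names in the text.",
}


def build_mrc_inputs(
    tokens: List[str], entity_types: List[str]
) -> List[Tuple[str, str, List[str]]]:
    """Pair the passage with one query per entity type (one MRC instance each)."""
    return [(etype, QUERIES[etype], tokens) for etype in entity_types]


def decode_spans(start_flags: List[int], end_flags: List[int]) -> List[Tuple[int, int]]:
    """Match each predicted start with the nearest predicted end at or after it.

    This pairing rule is an assumption made for illustration.
    """
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


tokens = ["Aspirin", "reduces", "the", "risk", "of", "myocardial", "infarction", "."]
instances = build_mrc_inputs(tokens, ["Chemical", "Disease"])

# Made-up predictions for the "Disease" query; a trained model such as a
# BERT-based tagger would produce these flags in practice.
disease_start = [0, 0, 0, 0, 0, 1, 0, 0]
disease_end = [0, 0, 0, 0, 0, 0, 1, 0]

disease_spans = decode_spans(disease_start, disease_end)
disease_entities = [" ".join(tokens[i : j + 1]) for i, j in disease_spans]
print(disease_entities)
```

Because each entity type gets its own query, the query text is where prior knowledge about the type can be injected, and the same passage is read once per type.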