Demonstration-based learning for few-shot biomedical named entity recognition under machine reading comprehension
- PMID: 39490610
- DOI: 10.1016/j.jbi.2024.104739
Abstract
Objective: Although deep learning techniques have achieved strong results, they typically depend on large amounts of hand-labeled data and tend to perform poorly in few-shot scenarios. This study aims to devise a strategy that improves a model's ability to recognize biomedical entities in few-shot settings.
Methods: We recast biomedical named entity recognition (BioNER) as a machine reading comprehension (MRC) problem and propose a demonstration-based learning method for few-shot BioNER that constructs appropriate task demonstrations. We compared the proposed method with existing advanced methods on six benchmark datasets: BC4CHEMD, BC5CDR-Chemical, BC5CDR-Disease, NCBI-Disease, BC2GM, and JNLPBA.
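To make the MRC formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes one natural-language query per entity type and a demonstration appended as an annotated example sentence. The query wording, the `[SEP]` separator, the `Answer:` template, and the names `Demonstration`, `QUERIES`, `build_mrc_input`, and `find_spans` are all hypothetical; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Demonstration:
    """An annotated example sentence shown to the model (hypothetical structure)."""
    sentence: str
    entities: List[str]


# Hypothetical queries, one per entity type; the paper's wording is not given.
QUERIES = {
    "Disease": "Find all disease mentions in the text.",
    "Chemical": "Find all chemical mentions in the text.",
}


def build_mrc_input(sentence: str, entity_type: str,
                    demos: List[Demonstration], sep: str = " [SEP] ") -> str:
    """Join the query, the target sentence, and each demonstration into one input string."""
    parts = [QUERIES[entity_type], sentence]
    for demo in demos:
        answer = "; ".join(demo.entities) if demo.entities else "none"
        parts.append(f"{demo.sentence} Answer: {answer}")
    return sep.join(parts)


def find_spans(context: str, answers: List[str]) -> List[Tuple[int, int]]:
    """Return sorted (start, end) character spans of every occurrence of each answer."""
    spans = []
    for answer in answers:
        start = context.find(answer)
        while start != -1:
            spans.append((start, start + len(answer)))
            start = context.find(answer, start + 1)
    return sorted(spans)


sentence = "Aspirin may cause Reye syndrome."
demo = Demonstration("Patients developed hepatitis.", ["hepatitis"])
model_input = build_mrc_input(sentence, "Disease", [demo])
gold_spans = find_spans(sentence, ["Reye syndrome"])
```

In an MRC setup of this kind, a span-extraction model would be trained to predict `gold_spans` within the target sentence, while the demonstration supplies an in-context example of the expected answer format.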
Results: We assessed model performance by reporting F1 scores from 25-shot and 50-shot learning experiments. In 25-shot learning, our method improved the average F1 score by 1.1% over the baseline method, reaching 61.7%, 84.1%, 69.1%, 70.1%, 50.6%, and 59.9% on the six datasets, respectively. In 50-shot learning, it improved the average F1 score by 1.0% over the baseline method, reaching 73.1%, 86.8%, 76.1%, 75.6%, 61.7%, and 65.4%, respectively.
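As a quick arithmetic check on the reported scores, the sketch below computes the unweighted mean F1 across the six datasets for each setting. Two assumptions apply: that "average" means a simple mean over datasets, and that the scores pair with the datasets in the order the Methods list them. Baseline averages are not computed because the abstract does not report per-dataset baseline scores.

```python
# Dataset order assumed to match the order listed in the Methods.
DATASETS = ["BC4CHEMD", "BC5CDR-Chemical", "BC5CDR-Disease",
            "NCBI-Disease", "BC2GM", "JNLPBA"]

# F1 scores (%) as reported in the abstract.
F1_25_SHOT = [61.7, 84.1, 69.1, 70.1, 50.6, 59.9]
F1_50_SHOT = [73.1, 86.8, 76.1, 75.6, 61.7, 65.4]


def mean(values):
    """Unweighted arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


avg_25 = mean(F1_25_SHOT)
avg_50 = mean(F1_50_SHOT)

# Per-dataset gain from doubling the number of shots.
gains = {name: round(f50 - f25, 1)
         for name, f25, f50 in zip(DATASETS, F1_25_SHOT, F1_50_SHOT)}

print(f"25-shot mean F1: {avg_25:.1f}")  # 65.9
print(f"50-shot mean F1: {avg_50:.1f}")  # 73.1
for name, gain in gains.items():
    print(f"{name}: +{gain} F1 from 25 to 50 shots")
```

Under these assumptions, the mean F1 rises by roughly 7 points from the 25-shot to the 50-shot setting.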
Conclusion: In few-shot BioNER, MRC-based language models recognize biomedical entities much more proficiently than the sequence labeling approach. Furthermore, our MRC-based language models are competitive with fully supervised methods that rely heavily on abundant annotated data. These results point to possible directions for future work on few-shot BioNER.
Keywords: Biomedical named entity recognition; Demonstration-based learning; Few-shot learning; Machine reading comprehension; Prompt-based learning.
Copyright © 2024 Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Similar articles
- Augmenting biomedical named entity recognition with general-domain resources. J Biomed Inform. 2024 Nov;159:104731. doi: 10.1016/j.jbi.2024.104731. Epub 2024 Oct 4. PMID: 39368529
- Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics. 2024 Mar 29;40(4):btae163. doi: 10.1093/bioinformatics/btae163. PMID: 38514400 Free PMC article.
- Resource-efficient instruction tuning of large language models for biomedical named entity recognition. J Biomed Inform. 2025 Aug 21;170:104896. doi: 10.1016/j.jbi.2025.104896. Online ahead of print. PMID: 40849052
- Extracting adverse drug events from clinical Notes: A systematic review of approaches used. J Biomed Inform. 2024 Mar;151:104603. doi: 10.1016/j.jbi.2024.104603. Epub 2024 Feb 6. PMID: 38331081
- Artificial intelligence-powered pharmacovigilance: A review of machine and deep learning in clinical text-based adverse drug event detection for benchmark datasets. J Biomed Inform. 2024 Apr;152:104621. doi: 10.1016/j.jbi.2024.104621. Epub 2024 Mar 5. PMID: 38447600 Review.