Demonstration-based learning for few-shot biomedical named entity recognition under machine reading comprehension

Leilei Su¹, Jian Chen², Yifan Peng³, Cong Sun⁴

Affiliations

¹ Department of Mathematics, Hainan University, Haikou 570228, China.
² Department of Data Science and Big Data Technology, Hainan University, Haikou 570228, China.
³ Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York 10065, USA.
⁴ Department of Data Science and Big Data Technology, Hainan University, Haikou 570228, China; Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York 10065, USA. Electronic address: cs2565@cornell.edu.

PMID: 39490610
DOI: 10.1016/j.jbi.2024.104739

Leilei Su et al. J Biomed Inform. 2024 Nov:159:104739. doi: 10.1016/j.jbi.2024.104739. Epub 2024 Oct 25.

Abstract

Objective: Although deep learning techniques have achieved strong results, they typically depend on large amounts of hand-labeled data and tend to perform poorly in few-shot scenarios. This study aims to devise a strategy that improves a model's ability to recognize biomedical entities in few-shot settings.

Methods: By recasting biomedical named entity recognition (BioNER) as a machine reading comprehension (MRC) problem, we propose a demonstration-based learning method for few-shot BioNER that constructs appropriate task demonstrations. We compared the proposed method with existing advanced methods on six benchmark datasets: BC4CHEMD, BC5CDR-Chemical, BC5CDR-Disease, NCBI-Disease, BC2GM, and JNLPBA.
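The abstract does not give the query wording or the demonstration template, so the following is only a minimal sketch of how an MRC-style input with task demonstrations might be assembled. The names `Example`, `render_demonstration`, `build_mrc_input`, and `span_targets`, the "Entities:" template, and the `[SEP]` separator are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Example:
    """A tokenized sentence with gold entity spans (inclusive token indices)."""
    tokens: List[str]
    spans: List[Tuple[int, int]]


def render_demonstration(ex: Example) -> str:
    """Turn a labeled support example into a text demonstration.

    The 'Entities:' template is an assumption made for illustration.
    """
    mentions = [" ".join(ex.tokens[s:e + 1]) for s, e in ex.spans]
    answer = "; ".join(mentions) if mentions else "none"
    return f"{' '.join(ex.tokens)} Entities: {answer}"


def build_mrc_input(query: str, target: Example, demos: List[Example],
                    sep: str = " [SEP] ") -> str:
    """Concatenate the query, the target sentence, and the demonstrations."""
    parts = [query, " ".join(target.tokens)]
    parts += [render_demonstration(d) for d in demos]
    return sep.join(parts)


def span_targets(ex: Example) -> Tuple[List[int], List[int]]:
    """Build per-token start/end labels, a common MRC span-extraction target."""
    start = [0] * len(ex.tokens)
    end = [0] * len(ex.tokens)
    for s, e in ex.spans:
        start[s] = 1
        end[e] = 1
    return start, end


# Hypothetical toy data, not drawn from any of the benchmark datasets.
query = "Find all chemical entities in the text."
demo = Example(["Aspirin", "reduces", "fever"], [(0, 0)])
target = Example(["Patients", "received", "sodium", "chloride"], [(2, 3)])

text = build_mrc_input(query, target, [demo])
start, end = span_targets(target)
print(text)
print(start, end)
```

In this framing the model reads the query and the demonstrations alongside the target sentence and predicts start and end positions for each mention, rather than assigning one tag per token as in sequence labeling.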

Results: We assessed the models by reporting F1 scores from 25-shot and 50-shot learning experiments. In 25-shot learning, the average F1 score improved by 1.1% over the baseline method, with F1 scores of 61.7%, 84.1%, 69.1%, 70.1%, 50.6%, and 59.9% on the six datasets, respectively. In 50-shot learning, the average F1 score improved by a further 1.0% over the baseline method, with F1 scores of 73.1%, 86.8%, 76.1%, 75.6%, 61.7%, and 65.4%, respectively.
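As a quick check on the reported figures, the per-dataset F1 scores can be averaged directly. This sketch assumes that "respectively" follows the dataset order given in the Methods and that the average is an unweighted mean over the six datasets; the abstract confirms neither explicitly.

```python
from statistics import mean

# F1 scores (%) as reported in the abstract; the dataset order is an assumption.
datasets = ["BC4CHEMD", "BC5CDR-Chemical", "BC5CDR-Disease",
            "NCBI-Disease", "BC2GM", "JNLPBA"]
f1_25_shot = dict(zip(datasets, [61.7, 84.1, 69.1, 70.1, 50.6, 59.9]))
f1_50_shot = dict(zip(datasets, [73.1, 86.8, 76.1, 75.6, 61.7, 65.4]))

# Unweighted mean across datasets (assumed averaging scheme).
avg_25 = mean(list(f1_25_shot.values()))
avg_50 = mean(list(f1_50_shot.values()))

print(f"25-shot average F1: {avg_25:.1f}")  # prints 65.9
print(f"50-shot average F1: {avg_50:.1f}")  # prints 73.1
```

Under these assumptions, doubling the number of shots from 25 to 50 raises the average F1 by roughly 7 points, with the largest gains on the two lowest-scoring datasets.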

Conclusion: We found that, for few-shot BioNER, MRC-based language models are much more proficient at recognizing biomedical entities than the sequence labeling approach. Furthermore, our MRC-based language models can compete with fully supervised methods that rely heavily on abundant annotated data. These results highlight possible pathways for future advances in few-shot BioNER.

Keywords: Biomedical named entity recognition; Demonstration-based learning; Few-shot learning; Machine reading comprehension; Prompt-based learning.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
