Review

Few-shot learning for medical text: A review of advances, trends, and opportunities

Yao Ge et al. J Biomed Inform. 2023 Aug;144:104458. doi: 10.1016/j.jbi.2023.104458. Epub 2023 Jul 23.

Abstract

Background: Few-shot learning (FSL) is a class of machine learning methods that require small numbers of labeled instances for training. With many medical topics having limited annotated text-based data in practical settings, FSL-based natural language processing (NLP) holds substantial promise. We aimed to conduct a review to explore the current state of FSL methods for medical NLP.

Methods: We searched for articles published between January 2016 and October 2022 using PubMed/Medline, Embase, ACL Anthology, and IEEE Xplore Digital Library. We also searched the preprint servers (e.g., arXiv, medRxiv, and bioRxiv) via Google Scholar to identify the latest relevant methods. We included all articles that involved FSL and any form of medical text. We abstracted articles based on the data source, target task, training set size, primary method(s)/approach(es), and evaluation metric(s).

Results: Fifty-one articles met our inclusion criteria: all were published after 2018, and most since 2020 (42/51; 82%). Concept extraction/named entity recognition was the most frequently addressed task (21/51; 41%), followed by text classification (16/51; 31%). Thirty-two articles (61%) reconstructed existing datasets to fit few-shot scenarios, and MIMIC-III was the most frequently used dataset (10/51; 20%). Most articles (77%) attempted to incorporate prior knowledge to augment the small datasets available for training. Common methods included FSL with attention mechanisms (20/51; 39%), prototypical networks (11/51; 22%), meta-learning (7/51; 14%), and prompt-based learning, the last of which has been particularly popular since 2021. Benchmarking experiments demonstrated relative underperformance of FSL methods on biomedical NLP tasks.

Conclusion: Despite the potential for FSL in biomedical NLP, progress has been limited. This may be attributed to the rarity of specialized data, the lack of standardized evaluation criteria, and the underperformance of FSL methods on biomedical topics. The creation of publicly available specialized datasets for biomedical FSL may aid method development by facilitating comparative analyses.

Keywords: Biomedical informatics; Few-shot learning; Machine learning; Natural language processing.

Conflict of interest statement

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Figure 1:
Architecture for metric learning: the support set is used to generate embeddings using the embedding function f1. The embeddings of the query set, also generated using f1, are compared with the support set embeddings using a suitable distance function f2. Depending upon the task, the label of the most similar (or dissimilar) support set samples is assigned to the query set samples.
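
As a rough illustration of this read-out, the Python sketch below assumes the support and query texts have already been embedded by some encoder f1 (not shown); cosine distance stands in for f2, and all names are illustrative rather than drawn from the reviewed methods.

    import numpy as np

    def f2(a, b):
        """Cosine distance between two embedding vectors (one common choice of metric)."""
        return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    def nearest_support_label(query_emb, support_embs, support_labels):
        """Assign the query the label of its most similar support instance."""
        dists = [f2(query_emb, s) for s in support_embs]
        return support_labels[int(np.argmin(dists))]
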
Figure 2:
Architecture for matching networks: a small support set contains labeled instances (one instance per label in the figure). Given a query, the goal is to compute a value that indicates whether the instance is an example of a given class. Two embedding functions, f() and g(), map the query and the support instances, respectively, into a shared feature space, and an attention kernel over the resulting embeddings weights the support labels to score each class. (Note: example uses the DASH 2020 Drug Data [18]).
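
A minimal sketch of that read-out, assuming the query and support instances have already been embedded by f() and g(): attention weights are a softmax over cosine similarities, and class scores are attention-weighted sums of one-hot support labels. Function names are illustrative.

    import numpy as np

    def attention(query_emb, support_embs):
        """Softmax over cosine similarities between the query and each support embedding."""
        sims = np.array([(query_emb @ s) / (np.linalg.norm(query_emb) * np.linalg.norm(s))
                         for s in support_embs])
        e = np.exp(sims - sims.max())
        return e / e.sum()

    def class_scores(query_emb, support_embs, support_onehot):
        """Attention-weighted sum of one-hot support labels; returns one score per class."""
        return attention(query_emb, support_embs) @ support_onehot
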
Figure 3:
Architecture for prototypical networks: a class’s prototype is the mean of its support set in the embedding space. Given a query, its distance to each class’s prototype is computed to decide its label. (Note: example uses the DASH 2020 Drug Data [18]).
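
The decision rule is compact enough to sketch directly; the snippet below assumes the embeddings are already computed and uses Euclidean distance, as in Snell et al. [2].

    import numpy as np

    def prototypes(support_embs, support_labels):
        """One prototype per class: the mean of that class's support embeddings."""
        return {c: support_embs[support_labels == c].mean(axis=0)
                for c in np.unique(support_labels)}

    def predict(query_emb, protos):
        """Label the query with the class of the nearest prototype."""
        return min(protos, key=lambda c: np.linalg.norm(query_emb - protos[c]))
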
Figure 4:
Architecture for transfer learning: in the context of few-shot learning, transfer learning uses a base task to train a base classifier (f()). In this example, the base classifier is trained on the task of addiction/recovery detection (text classification). The embeddings learned by the base classifier encode data-level prior knowledge and are reused to represent the target-task data, on which the target classifier (g()) is trained for a different, but related, text classification task: illicit drug detection.
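
A toy scikit-learn sketch of this setup, in which a shared TF-IDF representation stands in for the learned embedding function f(), and the texts and labels are invented purely for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Base task (data-rich, illustrative): addiction/recovery detection.
    base_texts = ["six months clean from pills", "relapsed on street drugs again",
                  "recovery program is helping", "craving pills every night"]
    base_labels = [1, 0, 1, 0]
    vec = TfidfVectorizer().fit(base_texts)   # shared representation (stands in for f())
    base_clf = LogisticRegression().fit(vec.transform(base_texts), base_labels)

    # Target task (few-shot): illicit drug detection; the target classifier g()
    # reuses the representation fitted on the base task.
    target_texts = ["selling pills on the street", "taking my prescribed medication"]
    target_labels = [1, 0]
    target_clf = LogisticRegression().fit(vec.transform(target_texts), target_labels)

In practice the reused component would be a neural encoder (e.g., a pretrained language model) rather than TF-IDF, but the division of labor is the same: a representation fitted on the base task, a classifier fitted on the few-shot target task.
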
Figure 5:
Architecture for meta-learning: each task mimics the few-shot scenario, and the tasks can be completely non-overlapping. Support sets are used to train; query sets are used to evaluate the model. In this example, several text classification tasks on different datasets (and label sets) are used to train the meta-learner, which is finally evaluated on how well it generalizes to the held-out test task (medical domain).
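
A toy first-order MAML-style loop conveys this episodic structure; the one-parameter regression tasks below are invented solely to keep the sketch runnable and are not from the reviewed papers.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_task():
        """Each task: 1-D linear regression y = w*x with a task-specific slope w."""
        w = rng.normal()
        x = rng.normal(size=10)
        return x[:5], w * x[:5], x[5:], w * x[5:]   # support / query split

    def grad(w, x, y):
        """Gradient of mean squared error for the scalar model y_hat = w*x."""
        return np.mean(2 * (w * x - y) * x)

    w_meta, inner_lr, meta_lr = 0.0, 0.1, 0.01
    for _ in range(1000):                            # one few-shot episode per step
        xs, ys, xq, yq = sample_task()
        w_task = w_meta
        for _ in range(5):                           # inner loop: train on the support set
            w_task -= inner_lr * grad(w_task, xs, ys)
        w_meta -= meta_lr * grad(w_task, xq, yq)     # outer loop: update on the query loss
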
Figure 6:
PRISMA flow diagram depicting the number of articles at each stage of collection and the filtering process.
Figure 7:
F1-scores for four FSL NER models on five different medical text datasets. Further details are reported in a recent publication [119].

References

    1. Sung F, Yang Y, Zhang L, Xiang T, Torr PH, and Hospedales TM, “Learning to compare: Relation network for few-shot learning,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1199–1208, 2018. eprint: https://openaccess.thecvf.com/content_cvpr_2018/papers/Sung_Learning_to_....
    2. Snell J, Swersky K, and Zemel RS, “Prototypical networks for few-shot learning,” Advances in Neural Information Processing Systems, vol. 30, 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/cb8da6767461f2812ae4290ea....
    3. Lake BM, Salakhutdinov R, and Tenenbaum JB, “One-shot learning by inverting a compositional causal process,” Advances in Neural Information Processing Systems, vol. 26, 2013. eprint: https://papers.nips.cc/paper/2013/file/52292e0c763fd027c6eba6b8f494d2eb-....
    4. Dong N and Xing EP, “Few-Shot Semantic Segmentation with Prototype Learning,” in British Machine Vision Conference (BMVC), vol. 3, 2018. eprint: http://bmvc2018.org/contents/papers/0255.pdf.
    5. Li W, Wang L, Xu J, Huo J, Gao Y, and Luo J, “Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7260–7268. eprint: https://openaccess.thecvf.com/content_CVPR_2019/papers/Li_Revisiting_Loc....
