Tackling the small data problem in medical image classification with artificial intelligence: a systematic review

Stefano Piffer¹ ², Leonardo Ubaldi¹ ², Sabina Tangaro³ ⁴, Alessandra Retico⁵, Cinzia Talamonti¹ ²

Affiliations

¹ Department of Experimental and Clinical Biomedical Sciences, University of Florence, Florence, Italy.
² National Institute for Nuclear Physics (INFN), Florence Division, Florence, Italy.
³ Department of Soil, Plant and Food Sciences, University of Bari Aldo Moro, Bari, Italy.
⁴ INFN, Bari Division, Bari, Italy.
⁵ INFN, Pisa Division, Pisa, Italy.

PMID: 39655846
DOI: 10.1088/2516-1091/ad525b

Stefano Piffer et al. Prog Biomed Eng (Bristol). 2024 Jun 17;6(3). doi: 10.1088/2516-1091/ad525b.

Abstract

Although AI research in medical imaging has attracted growing interest, training models requires a large amount of data. In this domain, available datasets are limited because collecting new data is either not feasible or requires burdensome resources. Researchers therefore face the problem of small datasets and must apply dedicated techniques to counter overfitting. A total of 147 peer-reviewed articles, published in English up until 31 July 2022, were retrieved from PubMed and assessed by two independent reviewers. We followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines for paper selection, and 77 studies were regarded as eligible for the scope of this review. Adherence to reporting standards was assessed using the TRIPOD statement (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis). To address the small data issue, transfer learning, basic data augmentation and generative adversarial networks were applied in 75%, 69% and 14% of cases, respectively. More than 60% of the authors performed binary classification, given the data scarcity and the difficulty of the tasks. Concerning generalizability, only four studies explicitly stated that an external validation of the developed model was carried out. Full access to all datasets and code was severely limited (unavailable in more than 80% of studies). Adherence to reporting standards was suboptimal (<50% adherence for 13 of 37 TRIPOD items). The goal of this review is to provide a comprehensive survey of recent advances in dealing with small sample sizes in medical imaging. It also encourages transparency, improved quality in publications, and adherence to existing reporting standards.
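
As a rough illustration of what "basic data augmentation" can look like in practice, the following is a minimal sketch, assuming NumPy and a 2D grayscale image stored as an array. The function name `basic_augmentations` and the toy input are hypothetical and are not taken from the reviewed studies; whether flips and rotations preserve the class label depends on the anatomy and the task.

```python
import numpy as np


def basic_augmentations(image):
    """Return the original 2D image plus simple geometric variants.

    Hypothetical helper for illustration only: flips and 90-degree
    rotations are assumed to leave the class label unchanged.
    """
    return [
        image,                 # original
        np.fliplr(image),      # horizontal flip (left-right)
        np.flipud(image),      # vertical flip (up-down)
        np.rot90(image, k=1),  # rotate 90 degrees counterclockwise
        np.rot90(image, k=2),  # rotate 180 degrees
        np.rot90(image, k=3),  # rotate 270 degrees counterclockwise
    ]


# Toy 2x3 array standing in for a small image.
toy = np.arange(6).reshape(2, 3)
augmented = basic_augmentations(toy)
```

Each input image yields six arrays here, so a small training set grows by that factor without new acquisitions, although the variants are strongly correlated and do not replace independent data.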

Keywords: artificial intelligence; classification; data augmentation; medical imaging; small data; transfer learning.

Creative Commons Attribution license.
