Nat Commun. 2025 May 13;16(1):4449.
doi: 10.1038/s41467-025-59478-8.

Improving AI models for rare thyroid cancer subtype by text guided diffusion models

Fang Dai et al. Nat Commun. 2025.

Abstract

Artificial intelligence applications in oncology imaging often struggle with diagnosing rare tumors. We identify significant gaps in detecting uncommon thyroid cancer types with ultrasound, where scarce data leads to frequent misdiagnosis. Traditional augmentation strategies do not capture the unique disease variations, hindering model training and performance. To overcome this, we propose a text-driven generative method that fuses clinical insights with image generation, producing synthetic samples that realistically reflect rare subtypes. In rigorous evaluations, our approach achieves substantial gains in diagnostic metrics, surpasses existing methods in authenticity and diversity measures, and generalizes effectively to other private and public datasets with various rare cancers. In this work, we demonstrate that text-guided image augmentation substantially enhances model accuracy and robustness for rare tumor detection, offering a promising avenue for more reliable and widespread clinical adoption.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Training and Evaluation Process of Tiger Model.
The Tiger Model is trained on the differences in disease subtype features, using disease knowledge (prompts). By extracting commonalities and differences in the feature domain, the model generates rare-subtype features with realistic diversity. The training data are ultrasound images and ultrasound reports. Tiger generates augmented datasets, which are merged with real data to train classification models. The realism and diversity of the generated images are then assessed by professionals and evaluated using computer vision metrics.
Fig. 2. Tiger Model Architecture Design and Applications.
The Tiger Model is trained on the differences in disease subtype features, using disease knowledge (prompts). The model identifies commonalities and differences within the feature domain, allowing it to generate diverse feature combinations that reflect true rare subtypes. The training data comprise ultrasound images and ultrasound report text. Tiger generates augmented datasets, which are merged with real data to train classification models. The realism and diversity of the generated images are then assessed by professionals and evaluated using computer vision metrics.
Fig. 3. Design and results of the first two Turing tests.
a Turing test 1: Three doctors judged each picture to be real or fake. The images are randomly selected from the Turing Test Set. b In Turing test 1, compared to other generative methods, the Tiger Model (Tiger-F) received the closest scores to real images. c Turing test 2: The expert chooses the one that corresponds to the text from four pictures. The four choices include one correct choice that corresponds to the sentence, and three different images randomly chosen from the Turing Test Set. d In Turing test 2, compared to other generative methods, images generated by the Tiger Model received the highest scores from medical experts in selecting the correct ones. Source data are provided as a Source Data file.
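The four-choice design of Turing test 2 can be illustrated with a minimal sketch. It assumes each item pairs one prompt with its matching image plus three distractors drawn at random from a pool; the names `build_item` and `accuracy` are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Sequence


def build_item(prompt: str, correct_image: str, pool: Sequence[str],
               rng: random.Random, n_distractors: int = 3) -> Dict:
    """Build one multiple-choice item: the matching image plus random distractors."""
    # Distractors come from the pool, excluding the matching image itself.
    candidates = [img for img in pool if img != correct_image]
    distractors = rng.sample(candidates, n_distractors)
    choices = distractors + [correct_image]
    rng.shuffle(choices)  # hide the position of the correct answer
    return {"prompt": prompt, "choices": choices,
            "answer": choices.index(correct_image)}


def accuracy(items: List[Dict], responses: List[int]) -> float:
    """Fraction of items where the chosen index matches the correct one."""
    correct = sum(r == item["answer"] for item, r in zip(items, responses))
    return correct / len(items)
```

With four choices per item, random guessing yields an expected accuracy of 0.25, which gives a baseline against which expert scores for each generative method can be read.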
Fig. 4. Design and results of the Turing test 3.
a Assessment by 50 doctors, based on 10 prompts, of the correspondence of features between real images and Tiger-F generated images. Annotations illustrate how doctors associated feature descriptions with image features in ultrasound images. b Results of correct feature judgments by doctors and by the CLIP model. Both the doctors and the CLIP model show similar proficiency in feature judgment on generated and real images. c Results for three alternative models (Stable Diffusion, Imagen, and Tiger-N), evaluated using physician assessments and the CLIP evaluation method. These models perform worse than Tiger-F in Fig. 4b. Source data are provided as a Source Data file.
Fig. 5. Effects of the number of generated images.
a Comparison of downstream task performance across amplification ratios and amplification methods in benign-malignant binary thyroid cancer prediction tasks. The results reveal performance limitations of other methods, possibly due to insufficient diversity in the detailed features of generated samples, while the Tiger Model shows substantial advantages, with performance continuing to improve. b Amplification with Tiger-generated data was compared with adding an equivalent number of real images, in terms of downstream classification of rare subtypes. In the FTC and MTC malignant tumor classification tasks, the model trained on Tiger-augmented data was significantly better than the unamplified model, and its results were similar to those obtained with real image amplification. The AUC results are presented as mean values, with error bars representing 95% confidence intervals derived from n = 50 experimental replicates for each task setting. In each replicate trial, the basic real images (x) were selected through bootstrap sampling from the real image set. Source data are provided as a Source Data file.
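The replicate procedure described for panel b can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the caption does not say how the 95% confidence interval was computed, so a normal approximation of the mean (mean ± 1.96 × standard error) is assumed here, and `evaluate` is a hypothetical callback that trains a classifier on the sampled images and returns its AUC.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def bootstrap_sample(items: Sequence, k: int, rng: random.Random) -> List:
    """Draw k items with replacement (one bootstrap sample)."""
    return rng.choices(items, k=k)


def mean_with_ci(values: Sequence[float], z: float = 1.96) -> Tuple[float, float, float]:
    """Mean with a normal-approximation 95% CI of the mean (assumed method)."""
    mean = statistics.fmean(values)
    half_width = z * statistics.stdev(values) / len(values) ** 0.5
    return mean, mean - half_width, mean + half_width


def replicate_auc(real_images: Sequence, x: int,
                  evaluate: Callable[[List], float],
                  n_replicates: int = 50, seed: int = 0) -> Tuple[float, float, float]:
    """Run n_replicates trials, each on a fresh bootstrap sample of x real images."""
    rng = random.Random(seed)
    aucs = [evaluate(bootstrap_sample(real_images, x, rng))
            for _ in range(n_replicates)]
    return mean_with_ci(aucs)
```

A percentile interval over the 50 replicate AUCs would be an equally plausible reading of the caption; swapping it in only changes `mean_with_ci`.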
