Review

Enhancing cancer differentiation with synthetic MRI examinations via generative models: a systematic review

Avtantil Dimitriadis et al. Insights Imaging. 2022 Dec 12;13(1):188. doi: 10.1186/s13244-022-01315-3

Abstract

Contemporary deep learning-based decision systems are well known for requiring high-volume datasets in order to produce generalized, reliable, and high-performing models. However, collecting such datasets is challenging and time-consuming, and it relies on expert clinicians whose time is limited. In addition, data collection often raises ethical and legal issues and depends on costly and invasive procedures. Deep generative models such as generative adversarial networks and variational autoencoders can capture the underlying distribution of the examined data, allowing them to create new and unique samples. This study aims to shed light on generative data augmentation techniques and the corresponding best practices. Through an in-depth investigation, we underline the limitations and potential methodological pitfalls from a critical standpoint and aim to promote open-science research by identifying publicly available open-source repositories and datasets.

Keywords: Data augmentation; Generative adversarial networks; Magnetic resonance imaging; Synthetic medical images; Variational autoencoders.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
The MRI sequences used in the examined studies. The brain anatomical region included the most studies, which explains why the T1 contrast-enhanced (T1ce), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), and T1-weighted (T1w) modalities prevail in the above bar chart. Apparent diffusion coefficient (ADC), Ktrans, and dynamic contrast-enhanced (DCE) modalities were examined in the prostate anatomical region. The remaining anatomical regions (i.e., pancreas, breast, liver) also included the first four depicted modalities. Lastly, one study concerning the brain region examined the amide proton transfer-weighted (APTw) modality
Fig. 2
The PRISMA flow diagram of the search methodology followed
Fig. 3
The graph depicts the gradual increase in studies on data augmentation with synthetic MRI examinations to enhance cancer differentiation. In 2020, the number of studies exceeded the total of the previous three years, indicating the growing interest of researchers in the field
Fig. 4
Summary results of the QUADAS-2 tool on risk of bias and applicability concerns for the studies included in the present systematic review
Fig. 5
A schematic view of variants of GANs and VAE. a The primary idea of the DCGAN compared to the vanilla GAN is that it adds transposed convolutional layers between the input vector Z and the output image in the generator. In addition, the discriminator incorporates convolutional layers to classify the generated and real images with the corresponding label, real or synthetic. b Training a GAN is not trivial: such models may never converge, and issues such as mode collapse and vanishing gradients are common. WGAN proposes a new cost function using the Wasserstein distance, which has a smoother gradient. The discriminator is referred to as the critic; it returns a value on a continuous range instead of 0 or 1 and therefore acts less strictly. c Training in PGGAN starts with a single convolution block in both the generator and the discriminator, producing 4 × 4 synthetic images; real images are also downsampled to 4 × 4. After a few iterations, another convolution layer is introduced in both networks, and this continues until the desired resolution is reached (e.g., 256 × 256 in the schematic). By progressively growing, the network learns high-level structures first, followed by the finer-scale details available at higher resolutions. d In contrast to traditional autoencoders, the VAE is both probabilistic and generative. The encoder learns the mean codings, μ, and the standard deviation codings, σ, so the model can randomly sample from a Gaussian distribution to generate the latent variables Z. These latent variables are then “decoded” to reconstruct the input
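
To make the sampling step in panel d concrete, the following minimal PyTorch sketch (our own illustration, not code from any of the reviewed studies; the layer sizes and the flattened input dimension are assumptions) shows an encoder predicting μ and log σ², sampling the latent variables Z via the reparameterization trick, and decoding them:

    import torch
    import torch.nn as nn

    class VAE(nn.Module):
        # Minimal fully connected VAE: the encoder predicts mean (mu) and
        # log-variance codings, the reparameterization trick samples the
        # latent variables z, and the decoder reconstructs the input.
        def __init__(self, input_dim=4096, latent_dim=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
            self.fc_mu = nn.Linear(256, latent_dim)
            self.fc_logvar = nn.Linear(256, latent_dim)
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 256), nn.ReLU(),
                nn.Linear(256, input_dim), nn.Sigmoid())

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.fc_mu(h), self.fc_logvar(h)
            std = torch.exp(0.5 * logvar)
            z = mu + std * torch.randn_like(std)  # sample z ~ N(mu, sigma^2)
            return self.decoder(z), mu, logvar

    def vae_loss(x, x_hat, mu, logvar):
        # Reconstruction term plus KL divergence to the standard normal prior.
        recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl
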
Fig. 6
Generative architectures for image-to-image translation. a pix2pix is an extension of the conditional GAN architecture that provides control over the generated image. The U-Net generator translates images from one domain to another, and through skip connections the low-level features are shared. The discriminator judges whether a patch of an image is real or synthetic instead of judging the whole image, while the modified loss function encourages the generated image to be plausible in the context of the target domain. b CycleGAN is designed specifically to perform image-to-image translation on unpaired sets of images. The architecture uses two generators and two discriminators. The two generators are often variations of autoencoders that take an image as input and produce an image as output, whereas each discriminator takes an image as input and outputs a single value. In CycleGAN, a generator gets further feedback from the other generator. This feedback confirms whether an image produced by a generator is cycle consistent, meaning that successively applying both generators to an image should reproduce a similar image. c In the MUNIT architecture, the image representation is decomposed into a content code and a style code through the respective encoders. The content code and style code are then recombined to translate an image to the target domain. By sampling different style codes, the model is capable of producing diverse and multimodal outputs
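
As an illustration of the cycle-consistency idea described above, the short PyTorch sketch below (our own example, not taken from the reviewed studies; the generator names gen_ab and gen_ba and the weight lam are assumptions following the original CycleGAN formulation) computes the L1 cycle term for both translation directions:

    import torch

    def cycle_consistency_loss(real_a, real_b, gen_ab, gen_ba, lam=10.0):
        # Translating A -> B -> A (and B -> A -> B) should reproduce the
        # original image; the L1 distance penalizes deviations, weighted
        # by lam.
        cycle_a = gen_ba(gen_ab(real_a))  # A -> B -> A
        cycle_b = gen_ab(gen_ba(real_b))  # B -> A -> B
        return lam * (torch.mean(torch.abs(cycle_a - real_a)) +
                      torch.mean(torch.abs(cycle_b - real_b)))
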
Fig. 7
The family of architectures proposed in the examined studies. WGANs, Wasserstein generative adversarial networks; PGGANs, progressive growing of generative adversarial networks; DCGANs, deep convolutional generative adversarial networks. Almost half of the examined studies employed translation architectures (i.e., pix2pix, CycleGAN, MUNIT) to translate from one MRI sequence to another or to incorporate different types of lesions into a healthy subject. The hybrid architectures combine GANs and VAEs to increase the stability of training and to generate higher-quality synthetic images. The studies with the remaining architectures focused on generating MRI images from a noise vector
Fig. 8
Key findings and a proposed pipeline from the examined studies. a Synthetic samples in each MRI sequence [39]. b Example of T1 contrast-enhanced synthetic tumor and normal examinations in both successful and failed cases [41]. c1 The proposed combined noise-to-image and image-to-image architectures for tumor detection [42]. c2 Example of T1 contrast-enhanced synthetic tumor and normal examinations in both successful and failed cases [42]. d Synthetic T1 contrast-enhanced samples with the tumor bounding boxes [44]. By “non-tumor” areas the authors refer to “normal examinations”
Fig. 9
Key generated samples from the examined studies. a The input of the AsynDGAN network, the generated sample, and the corresponding real image [47, 48]. b Generated images conditioned on lesion masks [50]. c An example of generated images with the corresponding segmentation and ground truth; yellow denotes edema, blue non-enhancing tumor, and green enhancing tumor, and the 3D representation of the tumor is shown on the top right [56]. d Synthetic samples of severe brain tumor cases; for better visualization, the authors display color-mapped images where yellow indicates higher and blue lower intensity [52]
Fig. 10
Evaluation methods for assessing the generative process: an indirect metric for the downstream task (i.e., classification, segmentation, detection, etc.), where performance is calculated before and after sample generation; qualitative analysis, where expert clinicians assess the generated images with statistical methods or via the Visual Turing Test; direct assessment of the generated samples with image quality metrics (e.g., MSE, FID, IS, etc.); and studies without any metric
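
As an illustration of a direct image quality metric, the sketch below computes the Fréchet inception distance (FID) from pre-extracted feature vectors. It is our own minimal example (the function name and the assumption that Inception-style features have already been extracted are ours), not code from the reviewed studies:

    import numpy as np
    from scipy import linalg

    def frechet_inception_distance(feats_real, feats_synth):
        # FID between two sets of feature vectors of shape
        # (n_samples, n_features): squared distance between the means plus
        # a trace term comparing the covariance matrices.
        mu_r, mu_s = feats_real.mean(axis=0), feats_synth.mean(axis=0)
        cov_r = np.cov(feats_real, rowvar=False)
        cov_s = np.cov(feats_synth, rowvar=False)
        covmean, _ = linalg.sqrtm(cov_r @ cov_s, disp=False)
        covmean = covmean.real  # discard tiny imaginary parts from numerics
        return float(np.sum((mu_r - mu_s) ** 2) +
                     np.trace(cov_r + cov_s - 2.0 * covmean))
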
Fig. 11
Qualitative methods used in the examined studies for evaluating the generated samples. Almost half of the examined studies evaluated the synthetic images by visualization, whereas 36.1% employed expert clinicians to assess the generated samples using statistical methods or an operator-assisted device that produces a stochastic sequence of binary questions from a given test image (i.e., the Visual Turing Test). Another 11.1% used cluster visualization methods such as PCA and t-SNE, and a small percentage (5.5%) did not use any qualitative method to assess the synthetic images

References

    1. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60:84–90. doi: 10.1145/3065386
    2. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. doi: 10.1109/cvprw.2009.5206848
    3. Hinton GE, Shallice T (1991) Lesioning an attractor network: investigations of acquired dyslexia. Psychol Rev. doi: 10.1037//0033-295x.98.1.74
    4. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge, MA
    5. Wiener N (1948) Time, communication, and the nervous system. Ann N Y Acad Sci 50(4):197–220. doi: 10.1111/j.1749-6632.1948.tb39853.x
