Comparative Study

Med Phys. 2025 Aug;52(8):e18038.

doi: 10.1002/mp.18038.

A comprehensive comparative study of generative adversarial network architectures for synthetic computed tomography generation in the abdomen

Mariia Lapaeva¹,²,³, Agustina La Greca Saint-Esteven¹,³, Philipp Wallimann¹, Nicolaus Andratschke¹, Matthias Guckenberger¹, Manuel Günther², Stephanie Tanadini-Lang¹, Riccardo Dal Bello¹

Affiliations

¹ Department of Radiation Oncology, University Hospital Zurich and University of Zurich, Zurich, Switzerland.
² Artificial Intelligence and Machine Learning Group, Department of Informatics, University of Zurich, Zurich, Switzerland.
³ Computer Vision Laboratory, ETH Zurich, Zurich, Switzerland.

PMID: 40804793
PMCID: PMC12351101
DOI: 10.1002/mp.18038


Mariia Lapaeva et al. Med Phys. 2025 Aug.


Abstract

Background: Magnetic Resonance (MR)-based synthetic Computed Tomography (sCT) generation is a promising emerging technique required for the transition from conventional planning workflows to MR-only radiotherapy planning. This shift aims to replace CT acquisition with an sCT, improving cost efficiency and reducing the burden on the patient. Generative Adversarial Networks (GANs) have shown some of the best performance in this area.

Purpose: This study aims to identify optimal approaches to improve the quality and clinical applicability of MR-based sCT generation for treatment planning by performing an extensive comparison of GAN architectures and their parameters. It focuses on the abdominal region, which still lacks certified medical products for sCT generation.

Methods: In order to improve the current state of deep learning technologies, we generated sCTs based on abdominal MR images of 154 cancer patients using GANs, varying the following parameters: (1) generator architectures (U-Net, ResNet); (2) GAN architectures trained in paired (Pix2Pix) and unpaired fashion (CycleGAN and CUT); (3) number of input-output channels (2D, 2.5D); (4) training set size. The quality of sCT generation was assessed by using both image similarity and dosimetric metrics; correlation between the two was evaluated. The dosimetric accuracy was evaluated through an automated process that compared the dose distributions of photon treatment plans calculated on sCT and CT images, using Dose-Volume Histogram (DVH) parameters for tumor and organs at risk.
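The difference between the 2D and 2.5D channel configurations can be illustrated with a minimal sketch. It assumes that "2.5D" means feeding each axial slice together with its neighbouring slices as extra channels; the abstract does not state how many neighbours were used or how edge slices were treated, so the function name `make_25d_stacks`, the parameter `n_adjacent`, and the edge-repeat padding are all illustrative assumptions.

```python
import numpy as np


def make_25d_stacks(volume: np.ndarray, n_adjacent: int = 1) -> np.ndarray:
    """Turn a (D, H, W) volume into (D, C, H, W) samples, C = 2 * n_adjacent + 1.

    Each sample holds one axial slice plus its neighbours as channels.
    Edge slices are padded by repeating the first/last slice (an assumption,
    not a detail taken from the study). n_adjacent = 0 gives plain 2D input.
    """
    if volume.ndim != 3:
        raise ValueError("expected a (D, H, W) volume")
    padded = np.pad(
        volume, ((n_adjacent, n_adjacent), (0, 0), (0, 0)), mode="edge"
    )
    depth = volume.shape[0]
    channels = 2 * n_adjacent + 1
    # Channel c of sample i is original slice i + c - n_adjacent.
    return np.stack([padded[c:c + depth] for c in range(channels)], axis=1)


# Toy volume with 5 axial slices of 2 x 2 voxels (hypothetical data).
mr = np.arange(5 * 2 * 2, dtype=np.float32).reshape(5, 2, 2)
stacks = make_25d_stacks(mr, n_adjacent=1)
print(stacks.shape)  # (5, 3, 2, 2)
```

The centre channel of every sample is the original slice, so a model trained on such stacks sees limited through-plane context without the memory cost of full 3D convolutions.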

Results: The Pix2Pix model, trained in paired fashion with 2.5D input-output channels and a ResNet generator, emerged as the best-performing model, achieving a mean value of the mean absolute error (MAE) of 63.21 HU, a planning target volume Dmean difference of -0.09%, and no outliers above 2% for other DVH parameters. This configuration addressed prior challenges of Pix2Pix with bone and rigid organ boundary generation, delivering robust results even for cases with significant air pockets. The 2D input-output channel configuration proved beneficial for GANs trained in unpaired fashion, achieving a mean MAE of 66.97 HU for CycleGAN and 69.49 HU for CUT. Both delivered clinically applicable results, with mean DVH discrepancies below 0.8%. Expanding the training set size was essential for minimizing outliers in dosimetric parameters. High correlation was observed between the image similarity metrics (MAE, MAE bones, structural similarity index measure) and target DVH parameters, with Pearson coefficients ranging from 0.77 to 0.9. However, within the clinically relevant range of DVH deviations (± 2%), stochastic variations obscured linear trends.
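The two kinds of numbers reported above (an MAE in HU and a percentage difference in a DVH parameter) can be computed along the following lines. This is a sketch under stated assumptions, not the study's evaluation pipeline: the helper names `mae_hu` and `relative_diff_percent` are hypothetical, the optional mask stands in for whatever body or bone region the authors used, and the sign convention (sCT minus CT, relative to CT) is assumed. All values in the example are made up.

```python
import numpy as np


def mae_hu(sct, ct, mask=None) -> float:
    """Mean absolute error in HU between an sCT and its reference CT.

    If a boolean mask is given, only voxels inside it contribute
    (for example a body contour, or a bone mask for "MAE bones").
    """
    sct = np.asarray(sct, dtype=np.float64)
    ct = np.asarray(ct, dtype=np.float64)
    abs_err = np.abs(sct - ct)
    if mask is not None:
        abs_err = abs_err[np.asarray(mask, dtype=bool)]
    return float(abs_err.mean())


def relative_diff_percent(value_sct: float, value_ct: float) -> float:
    """Signed difference of a DVH parameter in percent of the CT value.

    Assumed convention: negative means the sCT plan reports a lower value.
    """
    return 100.0 * (value_sct - value_ct) / value_ct


# Hypothetical 2 x 2 "images" in HU; the mask excludes the air voxel.
ct = np.array([[0.0, 100.0], [-1000.0, 400.0]])
sct = np.array([[10.0, 80.0], [-900.0, 400.0]])
body = np.array([[True, True], [False, True]])

mae_all = mae_hu(sct, ct)
mae_body = mae_hu(sct, ct, mask=body)
dmean_diff = relative_diff_percent(49.5, 50.0)  # hypothetical Dmean in Gy
print(mae_all, mae_body, dmean_diff)
```

Restricting the MAE to a mask matters in the abdomen, because a few misplaced air voxels can dominate a whole-image average.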

Conclusions: The study provided a new benchmark for the abdominal sCT generation task, showing its clinical applicability for treatment planning and further advancing the state of the art. It also confirmed that image similarity metrics alone cannot reliably predict small dosimetric deviations within a clinical threshold. It did, however, identify specific metrics that correlate with DVH discrepancies above ± 5%, offering valuable tools for training, evaluation, and standardization of reporting across studies.

Keywords: MR‐only radiotherapy; deep learning; generative adversarial networks; medical image analysis; synthetic CT.


Conflict of interest statement

The Department of Radiation Oncology, University Hospital Zurich has teaching and research agreements with Siemens Healthineers. University Hospital Zurich had teaching and research agreements with Viewray Inc.

Figures

**FIGURE 1**
Schematic representation of the study design. The study aims to answer five main research questions, regarding (RQ1) the generator architectures for producing sCTs, (RQ2) GAN architectures trained in paired and unpaired fashion, (RQ3) the number of input‐output channels, (RQ4) the impact of the training set size and (RQ5) whether image similarity metrics correlate with dosimetric metrics.

**FIGURE 2**
Research question 1. Example of sCT volumes generated with the help of U‐Net and ResNet generators for an adrenal gland case. The axial and sagittal slices reveal the ability of the ResNet to obtain sharper organ boundaries while outperforming U‐Net in the ability to avoid manual air and soft tissue bulk density override (Tissue OR) steps by automatically filling these structures correctly. The coronal slices show the performance in the bone region. The window width and window level of the shown CT and sCT images are 844 and ‐100 HU.

**FIGURE 3**
Research question 1. Differences in DVH dosimetric indicators between plans calculated on deformed CT (dCT) and sCT, generated with Pix2Pix models employing U‐Net and ResNet generators. The number of outliers below ‐2% and above 2% is shown next to the red lines for each DVH indicator. The right panel reports the results of the Wilcoxon‐Pratt test, with each DVH parameter evaluated independently (Table S5). The significance level p = 0.05 is highlighted with a vertical line, and no values below it are observed.

**FIGURE 4**
Research question 2. Differences in the dosimetric DVH indicators between plans calculated on dCT and sCT, generated with the Pix2Pix model, trained in paired fashion, and with the CycleGAN and CUT models, trained in unpaired fashion. The number of outliers below ‐2% and above 2% is shown next to the red lines for each DVH indicator. The right panel reports the results of the Friedman test, with each DVH parameter evaluated independently (Table S5). The significance level p = 0.05 is highlighted with a vertical line.

**FIGURE 5**
Research question 2. Example of sCT volumes generated with the help of architectures trained in paired (Pix2Pix) and unpaired fashion (CycleGAN, CUT) for a pancreas case with an extreme air volume. Air OR and Tissue OR contours define regions for air and soft tissue density overrides (OR), ensuring correct placement on deformed CT relative to MR for treatment planning. The coronal slices show the superior results of the architectures trained in unpaired fashion in rib formation. While Pix2Pix is less accurate in generating the air pocket area and tends to fill it with the HU values of the tissue, CycleGAN and CUT tend to generate additional air pockets in the case of extreme air volumes in a patient, as shown in the axial and sagittal slices. The window width and window level of the shown CT and sCT images are 844 and ‐100 HU.

**FIGURE 6**
Research question 3. Examples of sCT volumes generated utilizing 2D and 2.5D input‐output channel configurations for a liver case. Air OR and Tissue OR contours define regions for air and soft tissue density overrides (OR), ensuring correct placement on deformed CT relative to MR for treatment planning. The Pix2Pix 2.5D configuration induced a strong improvement in the area of the spine. However, for architectures trained in an unpaired fashion, additional distortions were observed for 2.5D: intense blurriness for CycleGAN and liver enlargement for CUT. The window width and window level of the shown CT and sCT images are 844 and ‐100 HU.

**FIGURE 7**
Research question 4. The analysis shows how dosimetric metrics, including all DVH parameters, evolve as the size of the training set increases (40 axial slices per 3D volume). The test cohorts remained the same for all models. The number of outliers above 2% is shown on the red line for each training set size.

**FIGURE 8**
Research question 5. Pearson (a) and Spearman rank (b) correlation coefficients between image similarity and dosimetric metrics are shown on the left side (darker color indicates stronger correlation). Correlation diagrams (c) between PTV Dmean difference (Abs. Dose diff., % = abs(sCTdose - dCTdose)/dCTdose × 100) and strongly correlated image similarity metrics (MAE, MAE Bones, MSE, SSIM) are shown on the right. The red line on the scatterplots is a Locally Estimated Scatterplot Smoothing (LOESS) curve, which visually represents how the values of one variable are related to the values of another variable in a local, non‐parametric manner. A straight red line indicates a more consistent linear relationship between the metrics.
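The quantities in the Figure 8 caption can be sketched in a few lines. The absolute dose difference follows the formula given in the caption; the Pearson coefficient uses `numpy.corrcoef`, and the Spearman coefficient is computed as the Pearson coefficient of the ranks. The function names and all numbers are hypothetical, and the simple ranking below does not handle ties (a library routine such as `scipy.stats.spearmanr` would be the safer choice on real data).

```python
import numpy as np


def abs_dose_diff_percent(dose_sct, dose_dct):
    """Abs. Dose diff., % = abs(sCTdose - dCTdose) / dCTdose * 100."""
    dose_sct = np.asarray(dose_sct, dtype=np.float64)
    dose_dct = np.asarray(dose_dct, dtype=np.float64)
    return np.abs(dose_sct - dose_dct) / dose_dct * 100.0


def pearson(x, y) -> float:
    """Pearson linear correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman_no_ties(x, y) -> float:
    """Spearman rank correlation, valid only when neither input has ties."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)


# Made-up per-patient values: MAE in HU and PTV Dmean in Gy.
mae = np.array([50.0, 60.0, 70.0, 80.0, 90.0])
dmean_sct = np.array([50.1, 50.2, 50.4, 50.8, 51.6])
dmean_dct = np.full(5, 50.0)

dose_diff = abs_dose_diff_percent(dmean_sct, dmean_dct)
r_pearson = pearson(mae, dose_diff)
r_spearman = spearman_no_ties(mae, dose_diff)
print(dose_diff, r_pearson, r_spearman)
```

In this toy data the dose difference grows monotonically but not linearly with MAE, so the rank correlation is perfect while the Pearson coefficient stays below 1, which is the kind of gap the two heatmaps in panels (a) and (b) can expose.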


