Review

Sensors (Basel). 2022 Nov 6;22(21):8540. doi: 10.3390/s22218540.

Unsupervised Image-to-Image Translation: A Review


Henri Hoyez et al. Sensors (Basel). 2022.

Abstract

Supervised image-to-image translation has been proven to generate realistic images with sharp details and to achieve good quantitative performance. Such methods are trained on a paired dataset, where each image from the source domain already has a corresponding translated image in the target domain. However, this pairing requirement imposes a severe practical constraint: building such a dataset demands domain knowledge and is even impossible in certain cases. To address these problems, unsupervised image-to-image translation has been proposed; it does not require domain expertise and can take advantage of large unlabeled datasets. Although such models perform well, they are hard to train because of the strong constraints imposed through their loss functions, which make training unstable. Since CycleGAN was released, numerous methods have been proposed that address various problems from different perspectives. In this review, we first describe the general image-to-image translation framework and discuss the datasets and metrics involved in the topic. We then review the current state of the art with a classification of existing works. This part is followed by a small quantitative evaluation, for which the results were taken from the original papers.
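As a concrete illustration of the loss constraints mentioned above, the sketch below shows the cycle-consistency term popularized by CycleGAN, written in PyTorch. This is a minimal sketch: the names G_AB, G_BA, and lambda_cyc are illustrative assumptions, not notation taken from this review.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_AB, G_BA, real_A, real_B, lambda_cyc=10.0):
    """Cycle-consistency term used by CycleGAN-style models: translating
    A -> B -> A (and B -> A -> B) should recover the original input."""
    # Forward cycle: A -> B -> A
    recon_A = G_BA(G_AB(real_A))
    # Backward cycle: B -> A -> B
    recon_B = G_AB(G_BA(real_B))
    # L1 distance between inputs and their reconstructions,
    # weighted by the cycle-consistency coefficient
    return lambda_cyc * (F.l1_loss(recon_A, real_A) + F.l1_loss(recon_B, real_B))
```

This term is added to the adversarial losses of both generators; it is precisely this kind of extra constraint on the loss that makes unpaired training possible but can also make it unstable.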

Keywords: computer vision; deep learning; generative adversarial networks; machine learning; review; unsupervised image-to-image translation.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Number of papers published per year in deep learning, generative adversarial networks, image translation, and image-to-image translation, counted from the arXiv database.
Figure 2
Dataset organization, classifying datasets by their main characteristics.
Figure 3
Categorization of metrics by their main characteristics.
Figure 4
The diffusion process. The image was taken from the Creative Commons website and has been released into the public domain by its author.
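For context, the forward (noising) half of the diffusion process shown in the figure can be sketched in a few lines. This is a minimal illustration of the standard Gaussian transition q(x_t | x_{t-1}); beta_t is assumed to be a scalar value from a noise schedule, which is not specified in the caption.

```python
import math
import torch

def forward_diffusion_step(x_prev, beta_t):
    """One step of the forward (noising) diffusion process:
    q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    noise = torch.randn_like(x_prev)
    # Shrink the previous sample slightly and add Gaussian noise
    return math.sqrt(1.0 - beta_t) * x_prev + math.sqrt(beta_t) * noise
```

Applying this step repeatedly destroys the image into pure noise; the generative model is trained to reverse it.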
Figure 5
General attribute-editing process in I2I translation methods. E is the style encoder, F is the mapping network that lets the model explore more modes, G is the generator, D is the discriminator, and C is the attribute classifier introduced in [10]. Face images were taken from StarGAN v2 [54].
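The caption's data flow can be sketched as follows; E, F, G, D, and C are assumed to be callable networks, and F is renamed F_map here only for readability. This is an illustrative sketch of the general pipeline, not the exact interface of any reviewed method.

```python
import torch

def attribute_editing_step(E, F_map, G, D, C, x_src, x_ref, z):
    """Data flow matching the caption: a style code comes either from the
    style encoder E (applied to a reference image) or from the mapping
    network F (applied to a random latent); the generator G combines the
    style with the source image, while D and C judge realism and attributes."""
    style_ref = E(x_ref)         # style extracted from a reference image
    style_rand = F_map(z)        # alternative style sampled via the mapping network
    fake = G(x_src, style_ref)   # translated image conditioned on the style code
    realism = D(fake)            # adversarial feedback on the output
    attrs = C(fake)              # predicted attributes for the classification loss
    return fake, realism, attrs
```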
Figure 6
General case of a contrastive learning method: using an anchor together with data augmentation, the model can learn useful representations.
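As an illustration of the anchor-based setup in the figure, below is a minimal InfoNCE-style contrastive loss in PyTorch. The batch layout (one augmented positive per anchor plus a shared pool of negatives) and the temperature value are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """Contrastive (InfoNCE-style) loss: pull each anchor toward its
    augmented positive and push it away from the negatives.
    anchor, positive: (B, D); negatives: (N, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    # Cosine similarities scaled by the temperature
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature  # (B, 1)
    neg_sim = anchor @ negatives.T / temperature                       # (B, N)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    # The positive similarity sits in column 0 of the logits
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```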
Figure 7
Dataset and metric usage across the papers in this review. (a) Dataset usage; (b) metric usage.

References

    1. Isola P., Zhu J.Y., Zhou T., Efros A.A. Image-to-Image Translation with Conditional Adversarial Networks; Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI, USA, 21–26 July 2017; pp. 5967–5976. - DOI
    2. Wang T.C., Liu M.Y., Zhu J.Y., Tao A., Kautz J., Catanzaro B. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. [(accessed on 31 December 2018)]; Available online: http://xxx.lanl.gov/abs/1711.11585.
    3. Bousmalis K., Silberman N., Dohan D., Erhan D., Krishnan D. Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks; Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI, USA, 21–26 July 2017; pp. 95–104. - DOI
    4. Liu M.Y., Breuel T., Kautz J. Unsupervised Image-to-Image Translation Networks. arXiv. 2017. arXiv:1703.00848.
    5. Taigman Y., Polyak A., Wolf L. Unsupervised Cross-Domain Image Generation. [(accessed on 31 December 2017)]; Available online: http://xxx.lanl.gov/abs/1611.02200 [cs].
