Review

Sensors (Basel). 2022 Nov 6;22(21):8540. doi: 10.3390/s22218540.

Unsupervised Image-to-Image Translation: A Review


Henri Hoyez et al. Sensors (Basel). 2022.

Abstract

Supervised image-to-image translation has been proven to generate realistic images with sharp details and to achieve good quantitative performance. Such methods are trained on a paired dataset, where each image from the source domain already has a corresponding translated image in the target domain. However, this pairing requirement imposes a severe practical constraint: building such a dataset demands domain knowledge and is even impossible in certain cases. To address these problems, unsupervised image-to-image translation has been proposed; it does not require domain expertise and can take advantage of large unlabeled datasets. Although such models perform well, they are hard to train because of the strong constraints imposed through their loss functions, which make training unstable. Since CycleGAN was released, numerous methods have been proposed that address various problems from different perspectives. In this review, we first describe the general image-to-image translation framework and discuss the datasets and metrics involved in the topic. We then review the current state of the art with a classification of existing works. This part is followed by a small quantitative evaluation, for which the results were taken from the original papers.
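As a concrete illustration of the loss constraints mentioned above, the sketch below shows the cycle-consistency term popularized by CycleGAN, written in PyTorch. This is a minimal sketch: the names G_AB, G_BA, and lambda_cyc are illustrative assumptions, not notation taken from this review.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_AB, G_BA, real_A, real_B, lambda_cyc=10.0):
    """Cycle-consistency term used by CycleGAN-style models: translating
    A -> B -> A (and B -> A -> B) should recover the original input."""
    # Forward cycle: A -> B -> A
    recon_A = G_BA(G_AB(real_A))
    # Backward cycle: B -> A -> B
    recon_B = G_AB(G_BA(real_B))
    # L1 distance between inputs and their reconstructions,
    # weighted by the cycle-consistency coefficient
    return lambda_cyc * (F.l1_loss(recon_A, real_A) + F.l1_loss(recon_B, real_B))
```

This term is added to the adversarial losses of both generators; it is precisely this kind of extra constraint on the loss that makes unpaired training possible but can also make it unstable.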

Keywords: computer vision; deep learning; generative adversarial networks; machine learning; review; unsupervised image-to-image translation.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Number of papers published per year in deep learning, generative adversarial networks, image translation, and image-to-image translation, counted from the arXiv database.
Figure 2
Dataset organization, classifying datasets by their main characteristics.
Figure 3
Categorization of metrics by their main characteristics.
Figure 4
The diffusion process. The image was taken from the Creative Commons website and has been released into the public domain by its author.
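For context, the forward (noising) half of the diffusion process shown in the figure can be sketched in a few lines. This is a minimal illustration of the standard Gaussian transition q(x_t | x_{t-1}); beta_t is assumed to be a scalar value from a noise schedule, which is not specified in the caption.

```python
import math
import torch

def forward_diffusion_step(x_prev, beta_t):
    """One step of the forward (noising) diffusion process:
    q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    noise = torch.randn_like(x_prev)
    # Shrink the previous sample slightly and add Gaussian noise
    return math.sqrt(1.0 - beta_t) * x_prev + math.sqrt(beta_t) * noise
```

Applying this step repeatedly destroys the image into pure noise; the generative model is trained to reverse it.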
Figure 5
General attribute-editing process in I2I translation methods. E is the style encoder, F is the mapping network that lets the model explore more modes, G is the generator, D is the discriminator, and C is the attribute classifier introduced in [10]. Face images were taken from StarGAN v2 [54].
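The caption's data flow can be sketched as follows; E, F, G, D, and C are assumed to be callable networks, and F is renamed F_map here only for readability. This is an illustrative sketch of the general pipeline, not the exact interface of any reviewed method.

```python
import torch

def attribute_editing_step(E, F_map, G, D, C, x_src, x_ref, z):
    """Data flow matching the caption: a style code comes either from the
    style encoder E (applied to a reference image) or from the mapping
    network F (applied to a random latent); the generator G combines the
    style with the source image, while D and C judge realism and attributes."""
    style_ref = E(x_ref)         # style extracted from a reference image
    style_rand = F_map(z)        # alternative style sampled via the mapping network
    fake = G(x_src, style_ref)   # translated image conditioned on the style code
    realism = D(fake)            # adversarial feedback on the output
    attrs = C(fake)              # predicted attributes for the classification loss
    return fake, realism, attrs
```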
Figure 6
General case of a contrastive learning method: using an anchor together with data augmentation, the model can learn useful representations.
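As an illustration of the anchor-based setup in the figure, below is a minimal InfoNCE-style contrastive loss in PyTorch. The batch layout (one augmented positive per anchor plus a shared pool of negatives) and the temperature value are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """Contrastive (InfoNCE-style) loss: pull each anchor toward its
    augmented positive and push it away from the negatives.
    anchor, positive: (B, D); negatives: (N, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    # Cosine similarities scaled by the temperature
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature  # (B, 1)
    neg_sim = anchor @ negatives.T / temperature                       # (B, N)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    # The positive similarity sits in column 0 of the logits
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```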
Figure 7
Dataset and metric usage across the papers in this review. (a) Dataset usage; (b) metric usage.

References

    1. Isola P., Zhu J.Y., Zhou T., Efros A.A. Image-to-Image Translation with Conditional Adversarial Networks; Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI, USA, 21–26 July 2017; pp. 5967–5976. - DOI
    2. Wang T.C., Liu M.Y., Zhu J.Y., Tao A., Kautz J., Catanzaro B. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. [(accessed on 31 December 2018)]; Available online: http://xxx.lanl.gov/abs/1711.11585.
    3. Bousmalis K., Silberman N., Dohan D., Erhan D., Krishnan D. Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks; Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Honolulu, HI, USA, 21–26 July 2017; pp. 95–104. - DOI
    4. Liu M.Y., Breuel T., Kautz J. Unsupervised Image-to-Image Translation Networks. arXiv. 2017. arXiv:1703.00848.
    5. Taigman Y., Polyak A., Wolf L. Unsupervised Cross-Domain Image Generation. [(accessed on 31 December 2017)]; Available online: http://xxx.lanl.gov/abs/1611.02200 [cs].
