AD-VAE: Adversarial Disentangling Variational Autoencoder

Adson Silva et al. Sensors (Basel). 2025 Mar 4;25(5):1574. doi: 10.3390/s25051574.
Abstract

Face recognition (FR) is a less intrusive biometric technology with applications such as security, surveillance, and access control systems. FR remains challenging, especially when only a single image per person is available as the gallery and when dealing with variations in pose, illumination, and occlusion. Deep learning techniques based on variational autoencoders (VAEs) and generative adversarial networks (GANs) have shown promising results in recent years, with approaches such as patch-VAE, VAE-GAN for 3D indoor scene synthesis, and hybrid VAE-GAN models. However, in single sample per person face recognition (SSPP FR), learning robust and discriminative features that preserve the subject's identity remains difficult. To address these issues, we propose a novel framework called AD-VAE, designed specifically for SSPP FR, which combines VAE and GAN techniques. AD-VAE learns to build representative, identity-preserving prototypes from both controlled and in-the-wild datasets, effectively handling variations in pose, illumination, and occlusion. The method uses four networks: an encoder and a decoder similar to a VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The framework achieves superior results on four controlled benchmark datasets (AR, E-YaleB, CAS-PEAL, and FERET), with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and remarkable performance on the uncontrolled LFW dataset, with a recognition rate of 99.6%. AD-VAE shows promising potential for future research and real-world applications.

Keywords: GAN; face recognition; single sample.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
The first part of the proposed AD-VAE, which works as a variational adversarial autoencoder. Here x denotes an image from the data X, and x_dec denotes the decoder reconstruction of x. The encoder Enc takes image x as input and produces two outputs, the mean (μ) and the log-variance (σ), which define the parameters of a normal distribution N(μ, σ). From N(μ, σ) we draw a latent vector c ∼ N(μ, σ), which serves as input to the decoder Dec, which outputs the reconstruction x_dec.
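As a rough illustration (not the authors' code), the sampling step in the caption above can be sketched with the standard VAE reparameterization trick; the latent dimension of 4 is a hypothetical choice for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, logvar, rng):
    """Reparameterization trick: draw c ~ N(mu, sigma) as mu + sigma * eps,
    with eps ~ N(0, I), so the sample stays differentiable w.r.t. mu/logvar."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Hypothetical encoder outputs for one image x (latent dimension 4).
mu = np.zeros(4)       # mean produced by Enc
logvar = np.zeros(4)   # log-variance produced by Enc (sigma = 1)

c = sample_latent(mu, logvar, rng)   # latent vector fed to the decoder Dec
print(c.shape)                       # (4,)
```

In a real VAE, mu and logvar would come from a trained encoder network; the trick matters because sampling directly from N(μ, σ) would block gradient flow to the encoder.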
Figure 2
The second part of the proposed AD-VAE architecture, where x denotes an image from the SSPP data X, x_rp denotes the real prototype of image x, and x̂ is the prototype generated from image x. The pre-trained (first-part) encoder Enc produces the mean μ and log-variance σ of x. From the distribution N(μ, σ) we draw a latent vector c ∼ N(μ, σ), which is concatenated with a noise vector z ∼ N(0, 1) to serve as the input to the generator Gen, which outputs the prototype x̂ of x. The discriminator D (1) determines the identity and variation of x; (2) determines the identity and variation of x̂ and whether it is real or fake; and (3) determines whether x_rp is real or fake.
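A minimal sketch of how the generator input described above could be assembled, assuming hypothetical latent and noise dimensions of 4 each (the paper's actual sizes may differ):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions chosen for illustration only.
latent_dim, noise_dim = 4, 4

c = rng.standard_normal(latent_dim)  # latent c ~ N(mu, sigma) from the pre-trained encoder
z = rng.standard_normal(noise_dim)   # noise z ~ N(0, 1)

gen_input = np.concatenate([c, z])   # combined vector fed to the generator Gen
print(gen_input.shape)               # (8,)
```

Concatenating an identity-bearing latent with free noise is a common conditioning pattern in GAN generators: c carries the subject's identity while z supplies the variability the generator can explore.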
Figure 3
Prototypes generated by AD-VAE: (a) the sample image with variations, (b) the generated prototype of image (a), and (c) the real prototype of image (a). On the right side, the dataset name and the variation are indicated.

