An Improved Conditional Wasserstein GAN With Gradient Penalty for Gene Expression Profiling Data Augmentation Based on Data Segmentation and Depth Feature Constraint

doi:10.1109/TCBBIO.2025.3560097

Fei Han, Yutao Liang, Qinghua Ling, Henry Han

PMID: 40811341
DOI: 10.1109/TCBBIO.2025.3560097

Fei Han et al. IEEE Trans Comput Biol Bioinform. 2025 Jul-Aug;22(4):1401-1414. doi: 10.1109/TCBBIO.2025.3560097.

Abstract

In practical medical diagnosis, the small sample sizes typical of gene expression profiling data can lead to overfitting. To address this, we use the Conditional Wasserstein Generative Adversarial Network with Gradient Penalty (CWGAN-GP) to augment the data. However, CWGAN-GP lacks control over where generated samples are located and struggles to balance the discriminator and the generator during training. To overcome these limitations, we propose the Improved CWGAN-GP, which introduces two improvements. The first is a data segmentation strategy based on sample influence scores. By calculating an influence score for each sample, we prioritize samples at decision boundaries and outside the distributions as the training set, which yields more explicit decision boundaries. The second is a depth feature constraint based on the Pearson correlation coefficient. Here, an encoder extracts deep features, and a constraint guided by the Pearson correlation coefficient is applied between the noise and these deep features. This strategy steers the model closer to a Nash equilibrium. Empirical evaluations on six publicly available gene expression profiling datasets validate our approach, showing that it generates higher-quality samples and is more stable than existing methods.
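
The abstract does not spell out how the Pearson-based constraint is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the noise vector and the encoder's deep features have the same dimensionality, and it turns the correlation into a penalty of the form 1 - r. The names `pearson_rows` and `correlation_penalty`, the penalty form, and the array shapes are all hypothetical.

```python
import numpy as np


def pearson_rows(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Row-wise Pearson correlation between two (batch, dim) arrays."""
    # Center each row so the correlation ignores per-sample offsets.
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    num = (a_c * b_c).sum(axis=1)
    # eps guards against division by zero for constant rows.
    den = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1)) + eps
    return num / den


def correlation_penalty(noise: np.ndarray, features: np.ndarray) -> float:
    """Hypothetical penalty: near 0 when each noise/feature pair is
    perfectly correlated, near 2 when perfectly anti-correlated.
    This form is an assumption, not taken from the paper."""
    return float(np.mean(1.0 - pearson_rows(noise, features)))


# Toy data: 4 noise vectors of dimension 16 and stand-in "deep features".
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 16))
feats = rng.normal(size=(4, 16))

r = pearson_rows(z, feats)              # one correlation per sample
penalty = correlation_penalty(z, feats)
```

In a real training loop this term would be computed with a differentiable framework and added to the generator or encoder loss with a weighting factor; both choices are left open here because the abstract does not specify them.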
