IEEE Trans Comput Biol Bioinform. 2025 Jul-Aug;22(4):1401-1414.
doi: 10.1109/TCBBIO.2025.3560097.

An Improved Conditional Wasserstein GAN With Gradient Penalty for Gene Expression Profiling Data Augmentation Based on Data Segmentation and Depth Feature Constraint

Fei Han et al. IEEE Trans Comput Biol Bioinform. 2025 Jul-Aug.

Abstract

In practical medical diagnosis, the small sample sizes typical of gene expression profiling data can lead to overfitting. To address this, we use the Conditional Wasserstein Generative Adversarial Network with Gradient Penalty (CWGAN-GP) to augment the data. However, CWGAN-GP lacks control over the locations of generated samples and struggles to balance the discriminator and generator during training. To overcome these limitations, we propose the Improved CWGAN-GP, which makes two key improvements. The first is a data segmentation strategy based on sample influence scores. By calculating an influence score for each sample, we prioritize samples at decision boundaries and outside the distributions as the training set, which yields more explicit decision boundaries. The second is a depth feature constraint based on the Pearson correlation coefficient: an encoder extracts deep features, and a constraint guided by the Pearson correlation coefficient is applied between the noise and the deep features. This strategy steers the model closer to a Nash equilibrium. Empirical evaluations on six publicly available gene expression profiling datasets validate our approach, showing that it generates higher-quality samples and is more stable than existing methods.
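
The abstract does not give the exact form of the depth feature constraint, so the following is only an illustrative sketch of how a Pearson-correlation penalty between noise vectors and encoder features might be computed. It assumes, hypothetically, that each noise vector and its deep feature vector have the same length and that the penalty is 1 - r averaged over the batch; the names `pearson` and `feature_constraint_loss` are invented here and are not taken from the paper.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    denom = math.sqrt(var_u * var_v)
    # A constant vector has no defined correlation; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def feature_constraint_loss(noise_batch, feature_batch):
    """Assumed penalty: mean of (1 - r) over the batch.

    It is 0 when every noise/feature pair is perfectly positively
    correlated and 2 when every pair is perfectly anti-correlated.
    """
    losses = [1.0 - pearson(z, f) for z, f in zip(noise_batch, feature_batch)]
    return sum(losses) / len(losses)


rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
# Stand-ins for encoder outputs (no real encoder is modeled here):
# an affine map of the noise is perfectly correlated with it,
# and a sign flip is perfectly anti-correlated.
aligned = [[2.0 * x + 1.0 for x in z] for z in noise]
flipped = [[-x for x in z] for z in noise]

print(feature_constraint_loss(noise, aligned))
print(feature_constraint_loss(noise, flipped))
```

In a real training loop a term of this kind would presumably be added to the generator or encoder objective with a weighting coefficient, using a differentiable framework in place of plain Python lists.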
