Generative Adversarial Networks for Creating Synthetic Nucleic Acid Sequences of Cat Genome
- PMID: 35409058
- PMCID: PMC8998662
- DOI: 10.3390/ijms23073701
Abstract
Nucleotides are the basic units of deoxyribonucleic acid (DNA) sequences. Every organism has a distinct DNA sequence composed of specific nucleotides, and that sequence reveals the genetic information carried by a particular DNA segment. Nucleic acid sequencing exposes evolutionary changes among organisms and has revolutionized disease diagnosis in animals. This paper proposes a generative adversarial network (GAN) model that creates synthetic nucleic acid sequences of the cat genome tuned to exhibit specific desired properties. We obtained the raw sequence data from Illumina next-generation sequencing and preprocessed it with the Cutadapt and DADA2 tools. The processed data were fed to a GAN model designed following the architecture of the Wasserstein GAN with gradient penalty (WGAN-GP). We introduced a predictor and an evaluator into the proposed GAN model to tune the synthetic sequences toward certain realistic properties. The predictor was built to extract samples containing a promoter sequence, and the evaluator was built to filter for samples that scored high on motif-matching. The filtered samples were then passed to the discriminator. We evaluated our model on multiple metrics and demonstrated outputs for latent interpolation, latent complementation, and motif-matching. Evaluation results showed that the proposed GAN model achieved 93.7% correlation with the original data and produced significant results compared with existing models for sequence generation.
Keywords: WGAN-GP; cat genome; generative adversarial networks; motif matching; nucleic acid sequences; promoter classification; promoter prediction; synthetic genome.
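The abstract does not specify how the evaluator computes its motif-matching score. As a minimal sketch of the general idea, the Python below scores each candidate sequence against a position weight matrix and keeps only the sequences whose best window reaches a threshold. The motif (a TATA-like pattern), the probability values, the uniform background frequency, the threshold, and all function names are assumptions introduced for illustration; none of them come from the paper.

```python
import math

# Hypothetical 4-position motif given as per-position base probabilities.
# The values are illustrative only and are not taken from the paper.
MOTIF_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform background frequency for each base


def window_score(window, pwm=MOTIF_PWM):
    """Log-odds score of one window against the motif (higher is a closer match)."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, window))


def motif_score(seq, pwm=MOTIF_PWM):
    """Best log-odds score over every window of the motif's width in seq."""
    width = len(pwm)
    if len(seq) < width:
        return float("-inf")  # sequence too short to contain the motif
    return max(
        window_score(seq[i:i + width], pwm)
        for i in range(len(seq) - width + 1)
    )


def filter_by_motif(seqs, threshold, pwm=MOTIF_PWM):
    """Keep only the sequences whose best motif score reaches the threshold."""
    return [s for s in seqs if motif_score(s, pwm) >= threshold]


# Toy generated samples; only those containing a TATA-like window pass.
samples = ["GGCTATAGGC", "GGGGCCCCGG", "ACGTTATACC"]
kept = filter_by_motif(samples, threshold=5.0)
```

In a pipeline like the one described, a filter of this kind would sit between the generator and the discriminator, so that only high-scoring samples are passed on. The sketch assumes sequences contain only A, C, G, and T; ambiguous bases such as N would need separate handling.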
Conflict of interest statement
The authors declare no conflict of interest regarding the design of this study, the analyses, or the writing of this manuscript.