Essential genes identification model based on sequence feature map and graph convolutional neural network
- PMID: 38200437
- PMCID: PMC10777564
- DOI: 10.1186/s12864-024-09958-w
Abstract
Background: Essential genes encode functions that are vital to the life activities of organisms, including growth, development, immune system function, and maintenance of cell structure. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models leaves room for improvement. A robust computational model that accurately predicts essential genes is therefore needed.
Results: In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences. First, the gene sequence is transformed into a feature map using coding techniques. A multi-layer GCN then performs graph convolution operations on this feature map, capturing both local and global features of the gene sequence. The resulting features are passed through convolutional and fully connected layers to generate the essential gene predictions. Gradient descent is used to iteratively update the model parameters so as to minimize the cross-entropy loss, thereby improving prediction accuracy. In addition, model parameters are tuned to find the combination that yields the best prediction performance during training.
Conclusions: Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.
Keywords: Bioinformatics; Essential genes; Gene sequences; Graphical convolutional neural networks; Machine learning.
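The pipeline described in the Results (sequence to feature map, graph convolution, dense output, cross-entropy loss) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-hot coding, the chain graph linking adjacent sequence positions, the mean pooling, the random weights, and all function names are hypothetical. The convolutional layer mentioned in the abstract is omitted for brevity.

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def one_hot_nodes(seq):
    """Assumed coding: one-hot vector per nucleotide, shape (len(seq), 4)."""
    idx = {base: i for i, base in enumerate(NUCLEOTIDES)}
    x = np.zeros((len(seq), 4))
    for pos, base in enumerate(seq.upper()):
        if base in idx:  # ambiguous bases such as N stay all-zero
            x[pos, idx[base]] = 1.0
    return x

def chain_adjacency(n):
    """Assumed graph: each position is linked to its immediate neighbours."""
    a = np.zeros((n, n))
    i = np.arange(n - 1)
    a[i, i + 1] = 1.0
    a[i + 1, i] = 1.0
    return a

def normalize_adjacency(a):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def predict_essential(seq, w1, w2, w_out):
    """Two GCN layers, mean pooling over nodes, then a sigmoid output unit."""
    x = one_hot_nodes(seq)
    a_norm = normalize_adjacency(chain_adjacency(len(seq)))
    h = gcn_layer(a_norm, x, w1)
    h = gcn_layer(a_norm, h, w2)
    pooled = h.mean(axis=0)          # graph-level feature vector
    logit = pooled @ w_out           # fully connected output
    return 1.0 / (1.0 + np.exp(-logit))

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy loss for one prediction p and label y in {0, 1}."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

if __name__ == "__main__":
    rng = np.random.default_rng(0)   # untrained, random weights
    w1 = rng.normal(size=(4, 8))
    w2 = rng.normal(size=(8, 8))
    w_out = rng.normal(size=8)
    p = predict_essential("ATGGCGTACGTTAG", w1, w2, w_out)
    print("predicted probability:", p)
    print("loss if the gene is essential:", binary_cross_entropy(p, 1))
```

In a trained model the weights would be updated by gradient descent to reduce this loss over a labelled set of essential and non-essential genes; the sketch only runs the forward pass.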
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.