iRSpot-GAEnsC: identifying recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples
- PMID: 26319782
- DOI: 10.1007/s00438-015-1108-5
Abstract
Meiotic recombination is vital for maintaining sequence diversity in the human genome. Meiosis and recombination are considered essential phases of cell division. In meiosis, the genome is divided into equal parts for sexual reproduction, whereas in recombination, diverse genomes are combined to form new combinations of genetic variations. Recombination does not occur randomly across the genome; it is concentrated in regions called recombination "hotspots" and is rare in regions called "coldspots". Owing to the rapid accumulation of polygenetic sequences in data banks, it is infeasible to recognize these sequences through conventional methods. Given the significance of recombination spots, an accurate, fast, robust, and high-throughput automated computational model is needed. In the proposed model, numerical descriptors are extracted using two sequence representation schemes: dinucleotide composition and trinucleotide composition. The performance of seven classification algorithms was investigated. Finally, the predicted outcomes of the individual classifiers are fused into an ensemble classifier, formed through majority voting and through a genetic algorithm (GA). The GA-based ensemble model performs better than both the individual classifiers and the majority voting-based ensemble model. iRSpot-GAEnsC achieved 84.46% accuracy. The empirical results show that iRSpot-GAEnsC outperforms not only the examined algorithms but also the existing methods in the literature. It is anticipated that the proposed model may be helpful to the research community, academia, and drug discovery.
Keywords: DNA; DNC; PNN; SVM; TNC.
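The two sequence representations and the simpler of the two fusion schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (kmer_composition, majority_vote), the normalization by total k-mer count, and the example sequence and votes are all assumptions made for illustration. The abstract does not name the seven classifiers or describe the GA-based fusion, so neither is modelled here.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq: str, k: int) -> list[float]:
    """Normalized frequencies of all 4**k k-mers over A, C, G, T.

    k=2 gives a 16-dimensional dinucleotide composition (DNC) vector and
    k=3 a 64-dimensional trinucleotide composition (TNC) vector.
    Assumption: frequencies are counts divided by the number of valid
    overlapping windows; the paper may normalize differently.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def majority_vote(predictions: list[int]) -> int:
    """Fuse binary labels (1 = hotspot, 0 = coldspot) by simple majority.

    With an odd number of classifiers, such as seven, no tie can occur;
    a tie with an even number falls back to 0 in this sketch.
    """
    return 1 if 2 * sum(predictions) > len(predictions) else 0


# Hypothetical usage: build an 80-dimensional feature vector, then fuse
# seven made-up classifier outputs for one sample.
sequence = "ACGTACGT"
features = kmer_composition(sequence, 2) + kmer_composition(sequence, 3)
fused_label = majority_vote([1, 1, 0, 1, 0, 0, 1])
```

A GA-based ensemble would replace the equal votes with weights found by evolutionary search; the abstract gives no details on that step, so it is left out.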
Similar articles
- iRSpot-TNCPseAAC: identify recombination spots with trinucleotide composition and pseudo amino acid components. Int J Mol Sci. 2014 Jan 24;15(2):1746-66. doi: 10.3390/ijms15021746. PMID: 24469313. Free PMC article.
- iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space. Artif Intell Med. 2017 Jun;79:62-70. doi: 10.1016/j.artmed.2017.06.008. Epub 2017 Jun 17. PMID: 28655440.
- iRSpot-DTS: Predict recombination spots by incorporating the dinucleotide-based spare-cross covariance information into Chou's pseudo components. Genomics. 2019 Dec;111(6):1760-1770. doi: 10.1016/j.ygeno.2018.11.031. Epub 2018 Dec 6. PMID: 30529702.
- [Meiotic recombination hotspots in eukaryotes]. Yi Chuan. 2005 Jul;27(4):641-50. PMID: 16120593. Review. Chinese.
- [Molecular mechanism of homologous recombination in meiosis: origin and biological significance]. Tsitologiia. 2007;49(3):182-93. PMID: 17582994. Review. Russian.
Cited by
- RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule. Database (Oxford). 2019 Jan 1;2019:baz131. doi: 10.1093/database/baz131. PMID: 31802128. Free PMC article.
- Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics. 2020 Mar;295(2):261-274. doi: 10.1007/s00438-019-01634-z. Epub 2020 Jan 1. PMID: 31894399. Review.
- PhoglyStruct: Prediction of phosphoglycerylated lysine residues using structural properties of amino acids. Sci Rep. 2018 Dec 18;8(1):17923. doi: 10.1038/s41598-018-36203-8. PMID: 30560923. Free PMC article.
- iPseU-CNN: Identifying RNA Pseudouridine Sites Using Convolutional Neural Networks. Mol Ther Nucleic Acids. 2019 Jun 7;16:463-470. doi: 10.1016/j.omtn.2019.03.010. Epub 2019 Apr 11. PMID: 31048185. Free PMC article.
- Detecting Succinylation sites from protein sequences using ensemble support vector machine. BMC Bioinformatics. 2018 Jun 25;19(1):237. doi: 10.1186/s12859-018-2249-4. PMID: 29940836. Free PMC article.