iDNA-Methyl: identifying DNA methylation sites via pseudo trinucleotide composition
- PMID: 25596338
- DOI: 10.1016/j.ab.2014.12.009
Abstract
Predominantly occurring on cytosine, DNA methylation is a process by which cells modify their DNA to change the expression of gene products. It plays important roles not only in development but also in the formation of nearly all types of cancer. Therefore, knowledge of DNA methylation sites is significant for both basic research and drug development. Given an uncharacterized DNA sequence containing many cytosine residues, which ones can be methylated and which cannot? With the avalanche of DNA sequences generated in the postgenomic age, computational methods for accurately identifying methylation sites in DNA are highly desirable. Using the trinucleotide composition, pseudo amino acid components, and a dataset-optimizing technique, we have developed a new predictor called "iDNA-Methyl" that has achieved remarkably higher success rates in identifying DNA methylation sites than the existing predictors. A user-friendly web server for the new predictor has been established at http://www.jci-bioinfo.cn/iDNA-Methyl, where users can easily obtain their desired results. We anticipate that the web-server predictor will become a very useful high-throughput tool for basic research and drug development, and that the novel approach and technique can also be applied to many other DNA-related problems and to genome analysis.
Keywords: 3→1 Codon conversion; DNA methylation; Neighborhood cleaning rule; Pseudo amino acid components; Synthetic minority oversampling technique; Target–jackknife cross-validation.
Copyright © 2014 Elsevier Inc. All rights reserved.
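The trinucleotide composition named in the abstract can be illustrated with a small sketch. The Python below is only an assumption about one common way to compute such a feature: normalized counts of overlapping 3-mers over the alphabet A, C, G, T. The function name `trinucleotide_composition` is hypothetical, and the sketch does not reproduce the pseudo amino acid components, the 3→1 codon conversion, or the dataset-optimizing step of iDNA-Methyl.

```python
from itertools import product

# All 64 possible trinucleotides in a fixed (lexicographic) order.
ALPHABET = "ACGT"
TRINUCLEOTIDES = ["".join(p) for p in product(ALPHABET, repeat=3)]


def trinucleotide_composition(seq):
    """Return a 64-element list of normalized overlapping 3-mer frequencies.

    Illustrative assumption only: windows containing characters other
    than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    # Slide a window of width 3 along the sequence, one base at a time.
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid windows: return an all-zero vector.
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / total for t in TRINUCLEOTIDES]


if __name__ == "__main__":
    vec = trinucleotide_composition("ACGTACGT")
    # Print only the trinucleotides that actually occur in the sequence.
    for tri, freq in zip(TRINUCLEOTIDES, vec):
        if freq > 0:
            print(tri, round(freq, 3))
```

A fixed-length vector of this kind is the sort of input a classifier would need regardless of the length of the sequence window around a candidate cytosine; how the published predictor combines it with its other components is described only in the full paper.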