CPPred: coding potential prediction based on the global description of RNA sequence
- PMID: 30753596
- PMCID: PMC6486542
- DOI: 10.1093/nar/gkz087
Abstract
Rapid and accurate approaches for distinguishing coding RNAs from ncRNAs play a critical role in analyzing the thousands of novel transcripts generated in recent years by next-generation sequencing technology. The previously developed methods CPAT, CPC2 and PLEK distinguish coding RNAs from ncRNAs very well overall, but perform poorly when distinguishing small coding RNAs from small ncRNAs. Herein, we report an approach, CPPred (coding potential prediction), which is based on an SVM classifier and multiple sequence features, including novel RNA features encoded by the global description. Owing to the addition of these novel RNA features, CPPred distinguishes coding RNAs from ncRNAs, and small coding RNAs from small ncRNAs, better than the state-of-the-art methods. A recent study proposed 1335 novel human coding RNAs from a large number of RNA-seq datasets. However, only 119 of these transcripts are predicted as coding RNAs by CPPred; the remaining 1216 (91.1%) are predicted to be ncRNAs, which is consistent with previous reports. Remarkably, we also reveal that three of the global description encoding features (T2, C0 and GC) play an important role in the prediction of coding potential.
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
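The abstract does not define the "global description" features, so the following is only a minimal sketch of one common reading: a composition/transition/distribution (CTD) encoding of the nucleotide sequence. Everything here is an assumption made for illustration and not the CPPred implementation: the function name `ctd_features`, the feature keys, the pair naming (`GC` for G/C transitions), the use of 0 to 4 for the first, 25%, 50%, 75% and last occurrence of a nucleotide, and the handling of U as T.

```python
# Hypothetical CTD-style encoding of an RNA sequence (illustrative only).
NUCLEOTIDES = "ACGT"
# Assumed naming of the six unordered nucleotide pairs.
TRANSITION_PAIRS = ("AG", "AC", "AT", "TG", "TC", "GC")
# Assumed distribution checkpoints: first, 25%, 50%, 75%, last occurrence.
PERCENTILES = (0, 25, 50, 75, 100)


def ctd_features(seq):
    """Return a dict of 30 composition, transition and distribution features."""
    # Treat RNA (U) and cDNA (T) alike and drop ambiguous symbols such as N.
    seq = seq.upper().replace("U", "T")
    seq = "".join(ch for ch in seq if ch in NUCLEOTIDES)
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two unambiguous nucleotides")

    feats = {}

    # Composition: fraction of each nucleotide in the sequence.
    for nt in NUCLEOTIDES:
        feats[nt] = seq.count(nt) / n

    # Transition: fraction of adjacent positions that switch between
    # the two nucleotides of a pair, in either direction.
    for pair in TRANSITION_PAIRS:
        switches = sum(
            1 for x, y in zip(seq, seq[1:]) if x != y and x in pair and y in pair
        )
        feats[pair] = switches / (n - 1)

    # Distribution: relative position (1-based, divided by length) of the
    # first, 25%, 50%, 75% and last occurrence of each nucleotide.
    for nt in NUCLEOTIDES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == nt]
        for k, pct in enumerate(PERCENTILES):
            if not positions:
                feats[f"{nt}{k}"] = 0.0
                continue
            # Integer ceiling of pct% of the occurrence count, as a 0-based index.
            idx = max(0, (pct * len(positions) + 99) // 100 - 1)
            feats[f"{nt}{k}"] = positions[idx] / n

    return feats


if __name__ == "__main__":
    example = ctd_features("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
    for key in ("T2", "C0", "GC"):
        print(key, round(example[key], 3))
```

Under these assumptions, `T2` would be the relative position of the median T, `C0` the relative position of the first C, and `GC` the G/C transition frequency. In a pipeline like the one described, such a vector would be combined with the other sequence features and passed to the SVM classifier.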