WHISTLE server: A high-accuracy genomic coordinate-based machine learning platform for RNA modification prediction
- PMID: 34245870
- DOI: 10.1016/j.ymeth.2021.07.003
Abstract
The primary sequences of DNA, RNA and protein are the dominant information source for existing machine learning tools, especially in contexts not yet fully explored by wet-lab experiments. Because molecular markers are tightly coordinated in living organisms, markers that cannot be unambiguously recovered from the primary sequence can often help to predict other biological events. To the best of our knowledge, no existing tool supports building and deploying machine learning models that take genomic evidence into account. We therefore developed the WHISTLE server, the first machine learning platform based on genomic coordinates. It features convenient covariate extraction and web deployment of models, with 46 distinct genomic features integrated alongside conventional sequence features. We showed that, when predicting m6A sites from the SRAMP project, the model integrating genomic features substantially outperformed models based on sequence features alone. The WHISTLE server should be a useful tool for studying biological attributes specifically associated with genomic coordinates, and is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/whi2.
Keywords: Epitranscriptome; Genomic coordinate; Web server.
Copyright © 2021. Published by Elsevier Inc.
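The idea of integrating genomic features with conventional sequence features can be illustrated with a minimal sketch. This is not the WHISTLE implementation: the function names, the region sets ("exon", "utr3") and the two features computed per region set (an overlap flag and a relative position) are hypothetical stand-ins for the 46 genomic features mentioned in the abstract, which are not listed here.

```python
from typing import Dict, List, Tuple

BASES = "ACGU"
Interval = Tuple[str, int, int]  # (chromosome, start, end), 1-based inclusive


def one_hot(seq: str) -> List[float]:
    """One-hot encode a sequence; T is read as U, unknown bases become all zeros."""
    out: List[float] = []
    for base in seq.upper().replace("T", "U"):
        out.extend(1.0 if base == b else 0.0 for b in BASES)
    return out


def genomic_features(chrom: str, pos: int,
                     regions: Dict[str, List[Interval]]) -> List[float]:
    """Hypothetical coordinate-based features.

    For each named region set (in sorted name order) emit two values:
    an overlap flag and the relative position of the site inside the
    first overlapping interval (0.0 when there is no overlap).
    """
    feats: List[float] = []
    for name in sorted(regions):
        flag, rel = 0.0, 0.0
        for c, start, end in regions[name]:
            if c == chrom and start <= pos <= end:
                flag = 1.0
                rel = (pos - start) / max(end - start, 1)
                break
        feats.extend([flag, rel])
    return feats


def encode_site(chrom: str, pos: int, flank_seq: str,
                regions: Dict[str, List[Interval]]) -> List[float]:
    """Concatenate sequence features with coordinate-based features."""
    return one_hot(flank_seq) + genomic_features(chrom, pos, regions)


# Illustrative input only; coordinates and annotations are made up.
example_regions = {
    "exon": [("chr1", 100, 200)],
    "utr3": [("chr1", 150, 200)],
}
vec = encode_site("chr1", 175, "GGACU", example_regions)
```

The resulting vector could be passed to any standard classifier. The point of the sketch is only that the coordinate-derived columns carry information that the flanking sequence alone does not encode.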