PrUb-EL: A hybrid framework based on deep learning for identifying ubiquitination sites in Arabidopsis thaliana using ensemble learning strategy
- PMID: 36206844
- DOI: 10.1016/j.ab.2022.114935
Abstract
Identification of ubiquitination sites is central to many biological experiments. Ubiquitination is a post-translational modification (PTM) of proteins; it is a key mechanism for increasing protein diversity and plays a vital role in regulating cell function. In recent years, many models have been developed to predict ubiquitination sites in humans, mice and yeast, but few studies have addressed Arabidopsis thaliana. In view of this, a deep network model named PrUb-EL is proposed to predict ubiquitination sites in Arabidopsis thaliana. First, six feature sets are extracted from the protein sequence: amino acid index database (AAindex) properties, dipeptide deviation from expected mean (DDE), dipeptide composition (DPC), the blocks substitution matrix BLOSUM62, enhanced amino acid composition (EAAC) and binary encoding. Second, the synthetic minority over-sampling technique (SMOTE) is used to balance the imbalanced data set. Third, a new classifier named DG is presented, which comprises a Dense block, a Residual block and a gated recurrent unit (GRU) block. Finally, each of the six feature extraction methods is integrated with the DG model, and an ensemble learning strategy is used to obtain the final prediction. Under 5-fold cross-validation, PrUb-EL achieves an accuracy (ACC) of 91.00% and an area under the ROC curve (auROC) of 97.70%; on the independent test, ACC and auROC are 88.58% and 96.09%, respectively. Compared with previous studies, our model shows significantly improved performance, making it an effective method for identifying ubiquitination sites in Arabidopsis thaliana. The datasets and code used for the article are available at https://github.com/Tom-Wangy/PreUb-EL.git.
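To make the feature-extraction and ensemble steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It shows two of the six encodings named in the abstract (binary encoding and DPC) and one possible way to combine per-feature model outputs. The abstract does not specify the ensemble rule, so the probability averaging (soft voting) here is an assumption, as are the function names, the 11-residue window and the example sequence.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def binary_encoding(window):
    """One-hot encode each residue; any other character (e.g. padding) is all zeros."""
    vec = []
    for res in window:
        vec.extend(1 if res == aa else 0 for aa in AMINO_ACIDS)
    return vec


def dipeptide_composition(window):
    """Relative frequency of each of the 400 dipeptides among adjacent residue pairs."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(window) - 1):
        pair = window[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard characters
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in DIPEPTIDES]


def soft_vote(prob_lists, threshold=0.5):
    """Assumed ensemble rule: average the per-model probabilities, then threshold."""
    n_models = len(prob_lists)
    mean = [sum(ps) / n_models for ps in zip(*prob_lists)]
    labels = [int(m >= threshold) for m in mean]
    return mean, labels


# Hypothetical 11-residue window centred on a candidate lysine (K).
window = "TAYIAKQRQIS"
onehot = binary_encoding(window)      # 11 positions x 20 residues
dpc = dipeptide_composition(window)   # 400 frequencies

# Hypothetical probabilities from two per-feature models for two sites.
mean_probs, labels = soft_vote([[0.9, 0.2], [0.7, 0.4]])
```

In a pipeline of the kind the abstract describes, each encoding would feed its own trained classifier, and class balancing such as SMOTE would be applied to the training data before fitting; both steps are omitted here for brevity.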
Keywords: Deep learning; Dense block; Ensemble learning; GRU block; Residual block; SMOTE; Ubiquitination sites.
Copyright © 2022 Elsevier Inc. All rights reserved.
Similar articles
- UbNiRF: A Hybrid Framework Based on Null Importances and Random Forest that Combines Multiple Features to Predict Ubiquitination Sites in Arabidopsis thaliana and Homo sapiens. Front Biosci (Landmark Ed). 2024 May 21;29(5):197. doi: 10.31083/j.fbl2905197. PMID: 38812315
- Computational identification of ubiquitination sites in Arabidopsis thaliana using convolutional neural networks. Plant Mol Biol. 2021 Apr;105(6):601-610. doi: 10.1007/s11103-020-01112-w. Epub 2021 Feb 1. PMID: 33527202
- PseAraUbi: predicting arabidopsis ubiquitination sites by incorporating the physico-chemical and structural features. Plant Mol Biol. 2022 Sep;110(1-2):81-92. doi: 10.1007/s11103-022-01288-3. Epub 2022 Jul 1. PMID: 35773617
- Mini-review: Recent advances in post-translational modification site prediction based on deep learning. Comput Struct Biotechnol J. 2022 Jun 30;20:3522-3532. doi: 10.1016/j.csbj.2022.06.045. eCollection 2022. PMID: 35860402. Free PMC article. Review.
- Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences. Database (Oxford). 2024 Jan 19;2024:baad094. doi: 10.1093/database/baad094. PMID: 38245002. Free PMC article. Review.
Cited by
- OnmiMHC: a machine learning solution for UCEC tumor vaccine development through enhanced peptide-MHC binding prediction. Front Immunol. 2025 Feb 28;16:1550252. doi: 10.3389/fimmu.2025.1550252. eCollection 2025. PMID: 40092998. Free PMC article.
- KD_MultiSucc: incorporating multi-teacher knowledge distillation and word embeddings for cross-species prediction of protein succinylation sites. Biol Methods Protoc. 2025 May 28;10(1):bpaf041. doi: 10.1093/biomethods/bpaf041. eCollection 2025. PMID: 40585181. Free PMC article.