BMC Syst Biol. 2018 Nov 22;12(Suppl 6):109.
doi: 10.1186/s12918-018-0628-0.

Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture

Fei He et al. BMC Syst Biol. 2018.

Abstract

Background: Ubiquitination, also called "lysine ubiquitination", occurs when a ubiquitin is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic dissection of the ubiquitination proteome is an appealing and challenging research topic. Existing methods for identifying protein ubiquitination sites fall into two kinds: mass spectrometry and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but are time-consuming and expensive. Developing computational approaches that can effectively and accurately identify protein ubiquitination sites is therefore a priority.

Results: Existing computational methods usually require feature engineering, which may lead to redundancy and biased representations. Deep learning, by contrast, can extract underlying characteristics from large-scale training data via multiple-layer networks and non-linear mapping operations. In this paper, we propose a deep architecture with multiple modalities to identify ubiquitination sites. First, drawing on prior biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties and sequence profiles, and designed different deep network layers to extract hidden representations from each. Then, the deep representations generated from the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available database of protein ubiquitination sites, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools.
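The first modality, raw sequence fragments around candidate lysines, can be illustrated with a minimal sketch. It is not the authors' code: the padding symbol `X`, the `half_width` parameter, the one-hot layout, and the function names are assumptions made for illustration, and the abstract does not state the window size actually used.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # hypothetical padding symbol for positions beyond the sequence ends


def lysine_windows(sequence: str, half_width: int) -> List[Tuple[int, str]]:
    """Return (0-based position, fragment) for every lysine (K) in the sequence."""
    padded = PAD * half_width + sequence + PAD * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # In the padded string the residue sits at index i + half_width,
            # so the slice [i, i + 2 * half_width + 1) is centred on it.
            windows.append((i, padded[i:i + 2 * half_width + 1]))
    return windows


def one_hot(fragment: str) -> List[List[int]]:
    """Encode a fragment as one 21-dimensional 0/1 vector per residue
    (20 standard amino acids plus one padding channel)."""
    alphabet = AMINO_ACIDS + PAD
    index = {aa: j for j, aa in enumerate(alphabet)}
    rows = []
    for aa in fragment:
        row = [0] * len(alphabet)
        # Residues outside the 20 standard ones fall into the padding channel.
        row[index.get(aa, index[PAD])] = 1
        rows.append(row)
    return rows


# Toy sequence, made up for illustration only.
example = lysine_windows("MKTAYIAKQR", half_width=3)
encoded = [one_hot(fragment) for _, fragment in example]
```

Each candidate site becomes a fixed-length matrix, which is the kind of input a convolutional layer expects. The physico-chemical and sequence-profile modalities would need their own encoders and are not sketched here.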

Conclusion: The comparative experiments validated the effectiveness of our deep network and showed that our method outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation.
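The four figures of merit quoted in the Results (specificity, sensitivity, accuracy, and MCC) all derive from a binary confusion matrix. The sketch below uses the standard definitions. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy and MCC from confusion counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=40, fp=20, tn=30, fn=10)
```

Unlike sensitivity and specificity, MCC also depends on the ratio of positive to negative samples, so it is a useful complement when the two classes differ in size.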

Keywords: Convolution neural network; Deep learning; Deep neural network; Multiple modalities; Protein ubiquitination site.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1. The structure of the proposed deep architecture
Fig. 2. The accuracies of validation samples using different window sizes on three modalities
Fig. 3. ROC and precision-recall curves comparing our multi-modal network and subnets of uni-modality
Fig. 4. t-SNE visualization of (a) input layers and (b) merged layer
Fig. 5. The ROC and precision-recall curves comparing proposed deep architecture and other protein ubiquitination site prediction tools
