BMC Bioinformatics. 2019 Dec 2;20(Suppl 16):531.
doi: 10.1186/s12859-019-3084-y.

Multimodal deep representation learning for protein interaction identification and protein family classification

Da Zhang et al. BMC Bioinformatics.

Abstract

Background: Protein-protein interactions (PPIs) are involved in many dynamic biological and pathological processes. A thorough understanding of PPIs is therefore crucial for explaining disease occurrence, achieving optimal drug-target therapeutic effects and describing protein complex structures. However, compared with the protein sequences available from various species and organisms, the number of known protein-protein interactions is relatively limited. To address this gap, considerable research effort has been devoted to facilitating the discovery of novel PPIs. Among these methods, PPI prediction techniques that rely solely on protein sequence data are more widespread than methods that require extensive biological domain knowledge.

Results: In this paper, we propose a multi-modal deep representation learning framework that combines protein physicochemical features with graph topological features from the PPI networks. Specifically, our method takes into account not only the protein sequence information but also the topological representation of each protein node in the PPI networks. We construct a stacked auto-encoder architecture together with a continuous bag-of-words (CBOW) model based on generated metapaths to study PPI prediction. We then use supervised deep neural networks to identify the PPIs and classify the protein families. The PPI prediction accuracy for eight species ranged from 96.76% to 99.77%, which indicates that our multi-modal deep representation learning framework achieves superior performance compared to other computational methods.
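
As a rough illustration of how node sequences from a PPI network can feed a CBOW model, the sketch below generates walks over a toy graph and turns each walk into (context, target) training pairs. It is a minimal sketch under stated assumptions, not the authors' implementation: uniform random walks stand in for the metapath generation, which the abstract does not specify, and the protein IDs, function names and parameters (`random_walks`, `cbow_pairs`, `walk_length`, `window`) are hypothetical.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node.

    Assumption: plain random walks stand in for the paper's metapaths.
    """
    rng = random.Random(seed)
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cbow_pairs(walk, window):
    """Build (context, target) pairs; CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(walk):
        context = walk[max(0, i - window):i] + walk[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


# Toy undirected PPI network with hypothetical protein IDs.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P3")]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, []).append(b)
    adjacency.setdefault(b, []).append(a)

walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
pairs = [pair for walk in walks for pair in cbow_pairs(walk, window=2)]
```

In a full pipeline, pairs like these would train an embedding layer whose vectors serve as the topological features for each protein; how those are combined with the sequence-derived features is left to the paper's full text.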

Conclusion: To the best of our knowledge, this is the first multi-modal deep representation learning framework for examining the PPI networks.

Keywords: Knowledge graph representation learning; Multimodal deep neural network; Protein-protein interaction network.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Multi-modal Deep Representation Learning Framework
Fig. 2. Metapath Generation
Fig. 3. Protein Family Classification Deep Neural Network
Fig. 4. SAE Loss for S. cerevisiae Species
Fig. 5. 5-CV AUC-ROC score for eight species
Fig. 6. (a) Comparison of ACC score between our method and traditional machine learning techniques over eight species. (b) Comparison of Recall score between our method and traditional machine learning techniques over eight species
Fig. 7. Comparison of the AUC-ROC score between our method and traditional machine learning methods over eight species
Fig. 8. 5-CV AUC-ROC score for HPRD dataset
Fig. 9. Training and Validation Accuracy
Fig. 10. Protein Multi-Family Classification Micro-F1 Score
Fig. 11. Protein Multi-Family Classification Macro-F1 Score
