BMC Bioinformatics. 2022 Dec 1;23(Suppl 7):518.
doi: 10.1186/s12859-022-04880-y.

Robust and accurate prediction of self-interacting proteins from protein sequence information by exploiting weighted sparse representation based classifier

Yang Li et al. BMC Bioinformatics.

Abstract

Background: Self-interacting proteins (SIPs), in which two or more copies of a protein expressed by the same gene interact with each other, play a central role in regulating most living cells and cellular functions. Although high-throughput experimental techniques can provide large amounts of SIP data, experimental identification of SIPs remains time-consuming, costly, inefficient, and inherently prone to high false-positive rates. It is therefore increasingly important to develop efficient and accurate computational approaches that complement experimental methods and accelerate the prediction of SIPs from protein sequence information.

Results: In this paper, we present a novel framework, termed GLCM-WSRC (gray level co-occurrence matrix-weighted sparse representation based classification), for automatically predicting SIPs from the evolutionary information contained in protein primary sequences. First, we convert each protein sequence into a Position Specific Scoring Matrix (PSSM), which captures sequence evolutionary information, using the Position Specific Iterated BLAST (PSI-BLAST) tool. Second, we use an efficient feature extraction approach, GLCM, to extract salient and invariant feature vectors from the PSSM, and then apply the adaptive synthetic (ADASYN) technique as a pre-processing step that balances the SIPs dataset by generating new feature vectors for classification. Finally, we employ an efficient and reliable WSRC model to identify SIPs based on known self-interacting and non-interacting proteins.
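The GLCM feature extraction step can be illustrated with a minimal Python sketch. It treats a PSSM as a small gray-scale image, counts how often pairs of quantized scores co-occur at a fixed offset, and summarizes each co-occurrence matrix with a few texture statistics. The abstract does not state the number of gray levels, the offsets, or which statistics the authors use, so the choices here (8 levels, four neighbor offsets, and contrast, energy, and homogeneity) are assumptions, as are the function names `quantize`, `glcm`, `glcm_features`, and `pssm_to_features`.

```python
def quantize(pssm, levels=8):
    """Map PSSM scores onto integer gray levels 0..levels-1 (levels is an assumed setting)."""
    flat = [v for row in pssm for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant matrix carries no texture; put everything in level 0.
        return [[0] * len(row) for row in pssm]
    scale = levels / (hi - lo)
    return [[min(int((v - lo) * scale), levels - 1) for v in row] for row in pssm]


def glcm(image, offset=(0, 1), levels=8):
    """Normalized counts of gray-level pairs (i, j) separated by a (row, col) offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1.0
                total += 1
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized co-occurrence matrix."""
    contrast = energy = homogeneity = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += v * (i - j) ** 2
            energy += v * v
            homogeneity += v / (1.0 + abs(i - j))
    return [contrast, energy, homogeneity]


def pssm_to_features(pssm, levels=8, offsets=((0, 1), (1, 1), (1, 0), (1, -1))):
    """Concatenate the statistics from one co-occurrence matrix per offset (assumed offsets)."""
    image = quantize(pssm, levels)
    features = []
    for off in offsets:
        features.extend(glcm_features(glcm(image, off, levels)))
    return features


# Toy 3 x 4 matrix standing in for a real L x 20 PSSM produced by PSI-BLAST.
toy_pssm = [[-3, 2, 0, 5], [1, -1, 4, 0], [0, 3, -2, 2]]
vector = pssm_to_features(toy_pssm, levels=4)
print(len(vector))  # 12 values: 4 offsets x 3 statistics
```

Because the feature vector length depends only on the number of offsets and statistics, proteins of different lengths map to vectors of the same size, which is what a downstream classifier such as WSRC needs as input.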

Conclusions: Extensive experimental results show that the proposed approach achieves high prediction performance, with 98.10% accuracy on the yeast dataset and 91.51% accuracy on the human dataset. These results suggest that the proposed model could be a useful tool for large-scale self-interacting protein prediction and for other bioinformatics detection tasks in the future.

Keywords: Gray level co-occurrence matrix; Protein sequence; Self-interacting proteins; Sparse representation.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Flow chart of the proposed model for predicting potential SIPs
Fig. 2. The ROC and AUPR performance of the WSRC-based method on the yeast SIPs dataset
Fig. 3. The ROC and AUPR performance of the WSRC-based method on the human SIPs dataset
Fig. 4. The ROC and AUPR performance of the SVM-based method on the yeast SIPs dataset
Fig. 5. The ROC and AUPR performance of the SVM-based method on the human SIPs dataset
