Methods. 2016 Jan 15;93:84-91.
doi: 10.1016/j.ymeth.2015.09.011. Epub 2015 Sep 11.

Integrated protein function prediction by mining function associations, sequences, and protein-protein and gene-gene interaction networks

Renzhi Cao et al. Methods.

Abstract

Motivation: Protein function prediction is an important and challenging problem in bioinformatics and computational biology. Functionally relevant biological information, such as protein sequences, gene expression, and protein-protein interactions, has mostly been used separately for protein function prediction. A major challenge is how to effectively integrate multiple sources of both traditional and new information, such as spatial gene-gene interaction networks generated from chromosomal conformation data, to improve protein function prediction.

Results: In this work, we developed three different probabilistic scores (the MIS, SEQ, and NET scores) to combine protein sequences, function associations, protein-protein interaction networks, and spatial gene-gene interaction networks for protein function prediction. The MIS score is generated mainly from homologous proteins found by PSI-BLAST search and from association rules between Gene Ontology terms, which are learned by mining the Swiss-Prot database. The SEQ score is generated from protein sequences. The NET score is generated from protein-protein interaction and spatial gene-gene interaction networks. These three scores were combined in a new Statistical Multiple Integrative Scoring System (SMISS) to predict protein function. We tested SMISS on the data set of the 2011 Critical Assessment of Function Annotation (CAFA). According to the maximum F-measure, the method performed substantially better than three baseline methods and an advanced method based on protein profile-sequence comparison, profile-profile comparison, and domain co-occurrence networks.
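
The comparison above is made by maximum F-measure. As a minimal sketch of how that metric is commonly computed in CAFA-style evaluations (protein-centric precision and recall, swept over score thresholds), the following may help. The function name `fmax`, the dictionary layout, and the threshold grid are illustrative assumptions, not the authors' evaluation code, and the sketch does not propagate terms through the Gene Ontology hierarchy.

```python
from typing import Dict, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         steps: int = 100) -> float:
    """Maximum F-measure over thresholds t = 1/steps, 2/steps, ..., 1.

    predictions: protein id -> {GO term -> score in [0, 1]} (assumed layout)
    truth:       protein id -> non-empty set of true GO terms (assumed layout)
    """
    best = 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions = []      # one entry per protein with >= 1 prediction at t
        recall_sum = 0.0     # accumulated over all benchmark proteins
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Precision is averaged only over proteins that receive at least one prediction at a given threshold, while recall is averaged over every benchmark protein, so a predictor that abstains on many proteins is penalized through recall rather than precision.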

Keywords: Chromosome conformation capturing; Data integration; Protein function prediction; Protein–protein interaction network; Spatial gene–gene interaction network.

Figures

Fig. 1
The overall flowchart of our method.
Fig. 2
Performance comparison of MIS, SEQ, and SMISS using the score scaling technique, benchmarked on CAFA1. The x-axis shows the recall of the prediction, and the y-axis shows the precision. (A) The performance of the original MIS score and of the MIS score with the score scaling technique starting from 1 or max. (B) The performance of the MIS score, the original SEQ score, and the scaled SEQ score. (C) The comparison between the MIS predictor and the SMISS predictor.
Fig. 3
The performance of SMISS compared with three standard baseline methods and three predictors from an automated three-level method. Predictions 57, 58, and 59 are the standard baseline methods, and Predictors 1, 2, and 3 are the three predictors from the automated three-level method. The x-axis shows the recall for each predictor, and the y-axis shows the precision.
