BMC Proc. 2012 Nov 13;6 Suppl 7(Suppl 7):S5.
doi: 10.1186/1753-6561-6-S7-S5. Epub 2012 Nov 13.

Evaluation of function predictions by PFP, ESG, and PSI-BLAST for moonlighting proteins

Ishita Khan et al. BMC Proc.

Abstract

Background: Advances in function prediction algorithms are enabling large-scale computational annotation of newly sequenced genomes. As the number of functionally well-characterized proteins has grown, many proteins have been observed to be involved in more than one function. These proteins, known as moonlighting proteins, show varied functional behavior depending on cell type, localization in the cell, oligomerization, multiple binding sites, etc. The functional diversity of moonlighting proteins may have a significant impact on traditional sequence-based function prediction methods. Here we investigate how well the diverse functions of moonlighting proteins can be predicted by some existing function prediction methods.

Results: We analyzed the performance of three major sequence-based function prediction methods, PSI-BLAST, the Protein Function Prediction (PFP) method, and the Extended Similarity Group (ESG) method, in predicting the diverse functions of moonlighting proteins. In predicting the discrete functions of a set of 19 experimentally identified moonlighting proteins, PFP showed the highest overall recall among the three methods. Although ESG showed the highest precision, its recall was lower than that of PSI-BLAST. Recall by PSI-BLAST improved greatly when BLOSUM45 was used instead of BLOSUM62.

Conclusion: We analyzed the performance of PFP, ESG, and PSI-BLAST in predicting the functional diversity of moonlighting proteins. PFP shows better overall performance in predicting diverse moonlighting functions than PSI-BLAST and ESG. Recall by PSI-BLAST improved greatly when BLOSUM45 was used. This analysis indicates that considering weakly similar sequences enhances the performance of sequence-based automated function prediction (AFP) methods in predicting the functional diversity of moonlighting proteins. The current study will also motivate the development of novel computational frameworks for the automatic identification of such proteins.

Figures

Figure 1. Precision-Recall of PFP, ESG, and PSI-BLAST.

Figure 2. Recall of PFP, ESG, and PSI-BLAST at each threshold. A, Recall where all the GO annotations for proteins are considered. B, Recall where only the GO annotations labeled as Function 1 or Function 2 for proteins are considered.

Figure 3. Recall of PFP, ESG, PSI-BLAST, and PSI-BLAST with the BLOSUM62 (default), BLOSUM30, and BLOSUM45 scoring matrices for each protein. Score thresholds used for the methods are PFP: 0.5, ESG: 0.35, and PSI-BLAST: 0.01. A, Recall where all the GO annotations for proteins are considered. B, Recall where only the GO annotations labeled as Function 1 or Function 2 for proteins are considered.
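
The precision and recall values reported in the figures can be illustrated with a minimal sketch. It assumes that each method returns a score per GO term, that a higher score means a more confident prediction, and that a predicted term counts as correct only when it exactly matches an annotated term. None of these assumptions is confirmed by the abstract, and the study's own matching rule may differ. The function name, GO identifiers, and scores below are hypothetical, and for a cutoff where lower is better (such as an E-value) the comparison would need to be inverted.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    scored: Dict[str, float], truth: Set[str], threshold: float
) -> Tuple[float, float]:
    """Precision and recall of GO terms predicted at or above a score threshold.

    Assumes higher scores are better and exact term matching (illustrative only).
    """
    # Keep only the terms whose score passes the threshold.
    predicted = {term for term, score in scored.items() if score >= threshold}
    true_positives = len(predicted & truth)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical predictions for one protein: GO term -> score.
example_scores = {"GO:0000001": 0.9, "GO:0000002": 0.6, "GO:0000003": 0.2}
# Hypothetical annotated GO terms for the same protein.
example_truth = {"GO:0000001", "GO:0000003", "GO:0000004"}

for cutoff in (0.5, 0.1):
    p, r = precision_recall(example_scores, example_truth, cutoff)
    print(f"threshold={cutoff}: precision={p:.2f}, recall={r:.2f}")
```

Lowering the threshold admits weaker predictions, which can raise recall, often at the cost of precision. Restricting `truth` to the terms labeled Function 1 or Function 2 would give the variant described for panel B.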
