PLoS One. 2016 Nov 15;11(11):e0166580.
doi: 10.1371/journal.pone.0166580. eCollection 2016.

Exploring Mouse Protein Function via Multiple Approaches

Guohua Huang et al. PLoS One. 2016.

Abstract

Although the number of available protein sequences is growing exponentially, functional annotation of proteins lags far behind. Accurate identification of protein functions therefore remains one of the major challenges in molecular biology. In this study, we present a novel approach to predicting mouse protein functions. The approach is a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in both the leave-one-out and ten-fold cross-validations. In the leave-one-out cross-validation, the similarity-based approach alone achieved an accuracy of 0.8756, but it was unable to predict the functions of proteins with no homologues. By comparison, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although this accuracy is lower than that of the similarity-based approach, the composition-based approach could predict the functions of almost all proteins, even those with no homologues. The combined method therefore balances the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results of the ten-fold cross-validation indicate that the combined method remains effective and stable when no close homologues are available. However, the accuracy of the predicted functions can only be assessed against protein functions that are currently known, and many protein functions remain unknown. For proteins whose 1st-order predicted functions were wrong but whose 2nd-order predicted functions were correct, the wrongly predicted 1st-order functions were shown to be closely associated with the genes encoding the proteins. These so-called wrongly predicted functions could therefore prove correct upon future experimental verification, and the accuracy of the presented method may be much higher in reality.
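
The sequential combination described in the abstract can be illustrated with a minimal sketch. It assumes that the three approaches are tried in the order in which the abstract lists them and that each one either returns a ranked list of candidate functions or declines to predict. The abstract does not spell out the exact fallback rule, so this is an illustration of the idea and not the authors' implementation. All names, protein IDs and function labels below are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

# A predictor returns a ranked list of function labels, or None when it
# cannot make a prediction (e.g. no homologue or no interaction partner).
Predictor = Callable[[str], Optional[List[str]]]


def predict_sequentially(protein_id: str,
                         predictors: Sequence[Predictor]) -> Optional[List[str]]:
    """Return the ranking from the first predictor that yields one."""
    for predictor in predictors:
        ranking = predictor(protein_id)
        if ranking:  # skip predictors that return None or an empty list
            return ranking
    return None


# Hypothetical toy lookups standing in for real homology and interaction data.
HOMOLOGUE_HITS = {"P1": ["binding", "transport"]}
INTERACTION_HITS = {"P2": ["metabolism", "binding"]}


def similarity_based(protein_id: str) -> Optional[List[str]]:
    return HOMOLOGUE_HITS.get(protein_id)


def interaction_based(protein_id: str) -> Optional[List[str]]:
    return INTERACTION_HITS.get(protein_id)


def composition_based(protein_id: str) -> Optional[List[str]]:
    # Assumed to work for almost any sequence, so it acts as the last resort.
    return ["binding", "metabolism", "transport"]


CASCADE = [similarity_based, interaction_based, composition_based]

for pid in ("P1", "P2", "P3"):
    print(pid, predict_sequentially(pid, CASCADE))
```

In this toy setup, `P1` is answered by the similarity-based stand-in, `P2` falls through to the interaction-based one, and `P3` reaches the composition-based fallback, which mirrors the trade-off between accuracy and coverage that the abstract describes.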

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The prediction accuracies of 24 order predictions for these three methods on the common dataset.
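
As a rough illustration of what a per-order accuracy measures, the sketch below assumes that the j-th order accuracy is the fraction of proteins whose j-th ranked predicted function is among their known functions. That definition is an assumption made here for illustration, since the caption does not state it, and the rankings and labels are made-up toy data.

```python
from typing import Dict, List, Set


def order_accuracy(rankings: Dict[str, List[str]],
                   known: Dict[str, Set[str]],
                   order: int) -> float:
    """Fraction of proteins whose `order`-th ranked prediction is a known function.

    `order` is 1-based; a protein with fewer than `order` predictions
    counts as a miss.
    """
    hits = 0
    for protein_id, ranking in rankings.items():
        if order <= len(ranking) and ranking[order - 1] in known[protein_id]:
            hits += 1
    return hits / len(rankings)


# Hypothetical ranked predictions and known annotations for four proteins.
rankings = {
    "A": ["f1", "f2"],
    "B": ["f2", "f1"],
    "C": ["f3", "f1"],
    "D": ["f1", "f3"],
}
known = {
    "A": {"f1"},
    "B": {"f1"},
    "C": {"f1", "f3"},
    "D": {"f1"},
}

first = order_accuracy(rankings, known, order=1)
second = order_accuracy(rankings, known, order=2)
print(first, second)
```

Under this assumed definition, protein `B` is a 1st-order miss but a 2nd-order hit, which is the kind of case the abstract singles out for closer inspection.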
