2010 Apr 29:11:215.
doi: 10.1186/1471-2105-11-215.

PoGO: Prediction of Gene Ontology terms for fungal proteins

Jaehee Jung et al. BMC Bioinformatics.

Abstract

Background: Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained on only one species, are not available for high-volume data processing, or require data derived from experiments such as microarray analysis. To meet the increasing need for high-throughput, automated annotation of fungal genomes, we have developed a tool for annotating fungal protein sequences with terms from the Gene Ontology.

Results: We describe a classifier called PoGO (Prediction of Gene Ontology terms) that uses statistical pattern recognition methods to assign Gene Ontology (GO) terms to proteins from filamentous fungi. PoGO is organized as a meta-classifier in which each evidence source (sequence similarity, protein domains, protein structure and biochemical properties) is used to train independent base-level classifiers. The outputs of the base classifiers are used to train a meta-classifier, which provides the final assignment of GO terms. An independent classifier is trained for each GO term, making the system amenable to updating, without having to re-train the whole system. The resulting system is robust. It provides better accuracy and can assign GO terms to a higher percentage of unannotated protein sequences than other methods that we tested.
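The per-term meta-classifier arrangement described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid learner, the class names `CentroidClassifier` and `TermClassifier`, the two evidence-source names, and the toy feature values are not taken from PoGO, whose actual learners and features are not specified in this abstract.

```python
from statistics import fmean


class CentroidClassifier:
    """Nearest-centroid binary classifier (a stand-in, not PoGO's learner)."""

    def fit(self, X, y):
        pos = [x for x, label in zip(X, y) if label]
        neg = [x for x, label in zip(X, y) if not label]
        # Column-wise means of the positive and negative examples.
        self.pos = [fmean(col) for col in zip(*pos)]
        self.neg = [fmean(col) for col in zip(*neg)]
        return self

    def score(self, x):
        # Greater than zero when x lies closer to the positive centroid.
        d_pos = sum((a - b) ** 2 for a, b in zip(x, self.pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, self.neg))
        return d_neg - d_pos

    def predict(self, x):
        return self.score(x) > 0


class TermClassifier:
    """Stacked classifier for ONE GO term: base learners plus a meta learner."""

    def fit(self, base_X, base_y, meta_X, meta_y):
        # base_X and meta_X map an evidence-source name to a list of vectors.
        self.sources = sorted(base_X)
        self.base = {s: CentroidClassifier().fit(base_X[s], base_y)
                     for s in self.sources}
        # The meta learner is trained on the outputs of the base learners.
        stacked = [self._base_scores({s: meta_X[s][i] for s in self.sources})
                   for i in range(len(meta_y))]
        self.meta = CentroidClassifier().fit(stacked, meta_y)
        return self

    def _base_scores(self, features):
        return [self.base[s].score(features[s]) for s in self.sources]

    def predict(self, features):
        return self.meta.predict(self._base_scores(features))


# Toy data for two hypothetical evidence sources.
base_X = {"similarity": [[0.9], [0.8], [0.1], [0.2]],
          "domains": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]}
base_y = [True, True, False, False]
meta_X = {"similarity": [[0.85], [0.15]],
          "domains": [[0.95, 0.05], [0.05, 0.95]]}
meta_y = [True, False]

# One independent model per GO term; the term ID is only an illustrative key.
models = {"GO:0003824": TermClassifier().fit(base_X, base_y, meta_X, meta_y)}

query = {"similarity": [0.7], "domains": [0.8, 0.2]}
assigned = [term for term, model in models.items() if model.predict(query)]
```

Because `models` holds one self-contained entry per term, adding or retraining a term only replaces that entry, which mirrors the updating property claimed in the abstract.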

Conclusions: Our annotation system overcomes many of the shortcomings that we found in other methods. We also provide a web server where users can submit protein sequences to be annotated.

Figures

Figure 1
Overview of the PoGO training and evaluation procedure. The PoGO classifier uses a Combiner configuration in which two base-level classifiers are trained on 45% of the training proteins and evaluated on another 45% of the training proteins. The remaining 10% of the training proteins are used to evaluate the meta-classifier. This configuration is repeated for each of the Gene Ontology terms trained in PoGO.
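The 45% / 45% / 10% partition in this caption could be produced along the following lines. This is a sketch under assumptions: the function name `combiner_split`, the random shuffling, the fixed seed, and the rounding rule are illustrative choices, and the caption does not say how PoGO actually assigns proteins to partitions.

```python
import random


def combiner_split(protein_ids, seed=0):
    """Split IDs into base-training, base-evaluation, and meta-evaluation sets.

    The first two partitions each take about 45% of the proteins; whatever
    remains (about 10%) is held out for evaluating the meta-classifier.
    """
    ids = list(protein_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_base = round(0.45 * len(ids))
    base_train = ids[:n_base]
    base_eval = ids[n_base:2 * n_base]
    meta_eval = ids[2 * n_base:]
    return base_train, base_eval, meta_eval


# Hypothetical protein identifiers, used only to exercise the function.
proteins = [f"protein_{i}" for i in range(100)]
base_train, base_eval, meta_eval = combiner_split(proteins)
```

In keeping with the caption, this split would be repeated for each GO term, since every term has its own set of positive and negative training proteins.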
Figure 2
Flowchart of the PoGO web server. Four different sequence analysis programs convert the input data into InterPro terms, BLAST results, biochemical properties, and protein structure information, represented by the gray and black boxes. After the transformed data are applied to the trained PoGO model, the final GO annotation and its supplementary information are obtained for each query protein.
