GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes
- PMID: 15550167
- PMCID: PMC535938
- DOI: 10.1186/1471-2105-5-178
Abstract
Background: The function of a novel gene product is typically predicted by transitive assignment of annotation from similar sequences. We describe a novel method, GOtcha, for predicting gene product function by annotation with Gene Ontology (GO) terms. GOtcha predicts GO term associations with term-specific probability (P-score) measures of confidence. Term-specific probabilities are a novel feature of GOtcha and allow the identification of conflicts or uncertainty in annotation.
Results: The GOtcha method was applied to the recently sequenced genome for Plasmodium falciparum and six other genomes. GOtcha was compared quantitatively for retrieval of assigned GO terms against direct transitive assignment from the highest scoring annotated BLAST search hit (TOPBLAST). GOtcha exploits information deep into the 'twilight zone' of similarity search matches, making use of much information that is otherwise discarded by more simplistic approaches. At a P-score cutoff of 50%, GOtcha provided 60% better recovery of annotation terms and 20% higher selectivity than annotation with TOPBLAST at an E-value cutoff of 10^-4.
Conclusions: The GOtcha method is a useful tool for genome annotators. It has identified both errors and omissions in the original Plasmodium falciparum annotation and is being adopted by many other genome sequencing projects.
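The TOPBLAST baseline mentioned in the Results (transitive assignment from the highest scoring annotated BLAST hit, subject to an E-value cutoff) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Hit` record, the `topblast_annotate` function, the use of bit score as the ranking criterion, and the inclusive `<=` comparison against the cutoff are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Set


class Hit(NamedTuple):
    """One BLAST search hit (hypothetical record layout)."""
    subject_id: str
    bit_score: float
    e_value: float


def topblast_annotate(
    hits: List[Hit],
    annotations: Dict[str, Set[str]],
    e_cutoff: float = 1e-4,
) -> Set[str]:
    """Transfer GO terms from the best-scoring annotated hit.

    Hits above the E-value cutoff and hits without any GO annotation
    are skipped; ranking by bit score is an assumption of this sketch.
    """
    annotated = [
        h for h in hits
        if h.e_value <= e_cutoff and annotations.get(h.subject_id)
    ]
    if not annotated:
        return set()  # no usable hit, so no annotation is transferred
    best = max(annotated, key=lambda h: h.bit_score)
    return set(annotations[best.subject_id])


# Illustrative data only: sequence "a" scores highest but carries no GO terms.
example_hits = [
    Hit("a", 200.0, 1e-50),
    Hit("b", 150.0, 1e-30),
    Hit("c", 40.0, 0.5),
]
example_annotations = {"b": {"GO:0003824"}, "c": {"GO:0005515"}}
predicted = topblast_annotate(example_hits, example_annotations)
```

In this example the unannotated top hit is passed over and the terms of the next annotated hit are transferred, while the weak hit beyond the cutoff is ignored. Discarding such low-scoring matches outright is the behaviour the abstract contrasts with GOtcha, which draws on them.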