BLANNOTATOR: enhanced homology-based function prediction of bacterial proteins
- PMID: 22335941
- PMCID: PMC3386020
- DOI: 10.1186/1471-2105-13-33
Abstract
Background: Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Typically, protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that is practical for large datasets and because it does not require additional functional genomics experiments. However, the existing solutions produce erroneous predictions in many cases, especially when query sequences have low levels of identity with the annotated source protein. This problem has created a pressing need for improvements in homology-based annotation.
Results: We present an automated method for the functional annotation of bacterial protein sequences. Based on sequence similarity searches, BLANNOTATOR accurately annotates query sequences with one-line summary descriptions of protein function. It groups sequences identified by BLAST into subsets according to their annotation and bases its prediction on a set of sequences with consistent functional information. We show the results of BLANNOTATOR's performance in sets of bacterial proteins with known functions. We simulated the annotation process for 3090 SWISS-PROT proteins using a database in its state preceding the functional characterisation of the query protein. For this dataset, our method outperformed the five others that we tested, and the improved performance was maintained even in the absence of highly related sequence hits. We further demonstrate the value of our tool by analysing the putative proteome of Lactobacillus crispatus strain ST1.
Conclusions: BLANNOTATOR is an accurate method for bacterial protein function prediction. It is practical for genome-scale data and does not require pre-existing sequence clustering; thus, this method suits the needs of bacterial genome and metagenome researchers. The method and a web-server are available at http://ekhidna.biocenter.helsinki.fi/poxo/blannotator/.
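The grouping idea described in the Results can be illustrated with a minimal sketch. This is not BLANNOTATOR's actual algorithm: the abstract does not specify how descriptions are compared or how subsets are scored, so the normalisation rule, the list of uninformative descriptions, and the use of summed bit scores are all assumptions made for illustration. The function names `normalize` and `predict_description` are hypothetical.

```python
from collections import defaultdict

# Assumption: descriptions that carry no functional information are skipped.
UNINFORMATIVE = ("hypothetical protein", "uncharacterized protein")


def normalize(description):
    """Lowercase and collapse whitespace so trivially different
    descriptions fall into the same subset (an illustrative rule only)."""
    return " ".join(description.lower().split())


def predict_description(hits, uninformative=UNINFORMATIVE):
    """Pick a description from BLAST-like hits.

    hits: iterable of (description, bit_score) tuples.
    Hits are grouped by normalized description, each group is scored by
    the sum of its bit scores (an assumed scoring rule), and the
    description of the best-scoring group is returned.
    Returns None when no informative hit is available.
    """
    group_scores = defaultdict(float)
    for description, bit_score in hits:
        key = normalize(description)
        if key in uninformative:
            continue  # ignore hits without functional information
        group_scores[key] += bit_score
    if not group_scores:
        return None
    return max(group_scores.items(), key=lambda item: item[1])[0]


example_hits = [
    ("Hypothetical protein", 900.0),
    ("L-lactate dehydrogenase", 310.0),
    ("L-lactate  dehydrogenase", 295.0),
    ("Malate dehydrogenase", 400.0),
]
prediction = predict_description(example_hits)
```

In this toy input the top-scoring single hit is uninformative, and the two consistent lactate dehydrogenase hits together outweigh the single malate dehydrogenase hit, which is the kind of situation where relying on a consistent subset rather than the best hit alone could change the prediction.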