FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function
- PMID: 17288570
- PMCID: PMC1796606
- DOI: 10.1186/1471-2148-7-S1-S12
Abstract
Background: Function prediction by transfer of annotation from the top database hit in a homology search has been shown to be prone to systematic error. Phylogenomic analysis reduces these errors by inferring protein function within the evolutionary context of the entire family. However, accurate function prediction for multi-domain proteins requires that all members of the family share the same overall domain structure. By contrast, the most common homolog detection methods are optimized for retrieving local homologs and do not address this requirement.
Results: We present FlowerPower, a novel clustering algorithm designed for the identification of global homologs as a precursor to structural phylogenomic analysis. Like methods such as PSI-BLAST, FlowerPower clusters sequences iteratively. However, rather than using a single HMM or profile to expand the cluster, FlowerPower identifies subfamilies using the SCI-PHY algorithm and then selects and aligns new homologs using subfamily hidden Markov models. FlowerPower is shown to outperform BLAST, PSI-BLAST and the UCSC SAM-Target 2K methods at discriminating between proteins in the same domain architecture class and those having different overall domain structures.
Conclusion: Structural phylogenomic analysis enables biologists to avoid the systematic errors associated with annotation transfer; clustering sequences that share the same domain architecture is a critical first step in this process. FlowerPower is shown to consistently identify homologous sequences having the same domain architecture as the query.
Availability: FlowerPower is available as a webserver at http://phylogenomics.berkeley.edu/flowerpower/.
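The iterative scheme described under Results can be illustrated with a toy sketch. This is not the FlowerPower implementation: the function names, thresholds, and sequences are all hypothetical, a greedy grouping step stands in for SCI-PHY, and a simple global similarity score (matched residues divided by the longer sequence length) stands in for subfamily HMM scoring. Normalizing by the longer length is an assumption made here so that hits covering only one domain score poorly.

```python
from difflib import SequenceMatcher


def global_similarity(a: str, b: str) -> float:
    """Toy stand-in for an HMM score: matched residues / longer length.

    Dividing by the longer sequence penalizes local-only matches.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b))


def subfamily_score(seq: str, subfamily: list) -> float:
    """Toy stand-in for a subfamily model: mean similarity to its members."""
    return sum(global_similarity(seq, m) for m in subfamily) / len(subfamily)


def split_into_subfamilies(cluster: list, threshold: float = 0.93) -> list:
    """Toy stand-in for SCI-PHY: greedy grouping by subfamily score."""
    subfamilies = []
    for seq in cluster:
        for sub in subfamilies:
            if subfamily_score(seq, sub) >= threshold:
                sub.append(seq)
                break
        else:
            subfamilies.append([seq])
    return subfamilies


def iterative_cluster(query, database, accept=0.85, max_iterations=10):
    """Grow a cluster from the query, re-deriving subfamilies each round."""
    cluster = [query]
    for _ in range(max_iterations):
        subfamilies = split_into_subfamilies(cluster)
        new_members = [
            seq for seq in database
            if seq not in cluster
            and max(subfamily_score(seq, sub) for sub in subfamilies) >= accept
        ]
        if not new_members:
            break  # converged: no subfamily recruits anything new
        cluster.extend(new_members)
    return cluster


# Hypothetical sequences: two-domain query, three variants, one local-only hit.
query = "MKTAYIAKQRGLSDEEWNPV"
close = "MKTAYLAKQRGLSDEEWNPV"
variant = "MKTAYIAKQRGLTDEEWNPI"
remote = "MKSAYIAKERGLTDEEWNPI"
local_hit = "MKTAYIAKQR" + "HHCCFFYYHH" * 2  # shares only the first domain
database = [close, variant, remote, local_hit]
members = iterative_cluster(query, database)
```

With these toy sequences, `remote` scores too low against the query alone and is recruited only in the second round, through the subfamily seeded by `variant`; `local_hit` shares a single domain with the query and is never accepted.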