The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification
- PMID: 23685612
- PMCID: PMC3692063
- DOI: 10.1093/nar/gkt399
Abstract
The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/.
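The placement idea in the abstract can be illustrated with a minimal sketch: score the query against each subtree's hidden Markov model, keep the top-scoring subtree, and read candidate orthologs and an annotation off that subtree. This is not the FAT-CAT pipeline itself. The `SubtreeHit` record, the `place_query` and `predict` functions, the score threshold and all example values are hypothetical, and the scores are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class SubtreeHit:
    """One query-vs-subtree-HMM result (hypothetical record layout)."""
    subtree_id: str            # made-up identifier for a subtree node
    score: float               # assumed HMM score of the query against this subtree
    members: Tuple[str, ...]   # sequence identifiers under the subtree
    annotation: str            # annotation attached to the subtree


def place_query(hits: Sequence[SubtreeHit], min_score: float) -> Optional[SubtreeHit]:
    """Return the top-scoring hit at or above min_score, or None if none qualifies."""
    eligible = [h for h in hits if h.score >= min_score]
    if not eligible:
        return None
    return max(eligible, key=lambda h: h.score)


def predict(hits: Sequence[SubtreeHit], min_score: float) -> dict:
    """Derive candidate orthologs and an annotation from the best placement."""
    best = place_query(hits, min_score)
    if best is None:
        return {"subtree": None, "orthologs": [], "annotation": None}
    return {
        "subtree": best.subtree_id,
        "orthologs": sorted(best.members),
        "annotation": best.annotation,
    }


# Illustrative values only; none of these come from PhyloFacts.
example_hits = [
    SubtreeHit("subtree_A", 120.5, ("seq_2", "seq_1"), "kinase"),
    SubtreeHit("subtree_B", 310.0, ("seq_4", "seq_3"), "phosphatase"),
]
result = predict(example_hits, min_score=200.0)
```

Raising `min_score` in this sketch trades coverage for precision: a query that scores weakly against every subtree is left unplaced rather than forced into the nearest group.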
Similar articles
- Berkeley PHOG: PhyloFacts orthology group prediction web server. Nucleic Acids Res. 2009 Jul;37(Web Server issue):W84-9. doi: 10.1093/nar/gkp373. Epub 2009 May 12. PMID: 19435885.
- Automated protein subfamily identification and classification. PLoS Comput Biol. 2007 Aug;3(8):e160. doi: 10.1371/journal.pcbi.0030160. PMID: 17708678.
- PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol. 2006;7(9):R83. doi: 10.1186/gb-2006-7-9-r83. PMID: 16973001.
- Phylogenomic inference of protein molecular function: advances and challenges. Bioinformatics. 2004 Jan 22;20(2):170-9. doi: 10.1093/bioinformatics/bth021. PMID: 14734307. Review.
- Ortholog identification in the presence of domain architecture rearrangement. Brief Bioinform. 2011 Sep;12(5):413-22. doi: 10.1093/bib/bbr036. Epub 2011 Jun 28. PMID: 21712343. Review.
Cited by
- OMA standalone: orthology inference among public and custom genomes and transcriptomes. Genome Res. 2019 Jul;29(7):1152-1163. doi: 10.1101/gr.243212.118. Epub 2019 Jun 24. PMID: 31235654.
- A De-Novo Genome Analysis Pipeline (DeNoGAP) for large-scale comparative prokaryotic genomics studies. BMC Bioinformatics. 2016 Jun 30;17(1):260. doi: 10.1186/s12859-016-1142-2. PMID: 27363390.
- Diversity in protein domain superfamilies. Curr Opin Genet Dev. 2015 Dec;35:40-9. doi: 10.1016/j.gde.2015.09.005. Epub 2015 Nov 3. PMID: 26451979. Review.
- Comparative genomics reveals contraction in olfactory receptor genes in bats. Sci Rep. 2017 Mar 21;7(1):259. doi: 10.1038/s41598-017-00132-9. PMID: 28325942.
- An introduction to the analysis of shotgun metagenomic data. Front Plant Sci. 2014 Jun 16;5:209. doi: 10.3389/fpls.2014.00209. eCollection 2014. PMID: 24982662. Review.