Phylogenomic inference of protein molecular function: advances and challenges
- PMID: 14734307
- DOI: 10.1093/bioinformatics/bth021
Abstract
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on these data. Phylogenomic analysis, which combines phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs, has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution.
Results: Protein functional classification by phylogenomic analysis yields fewer expected false positives overall than classification by pairwise methods. We present an overview of the motivations and fundamental principles of phylogenomic analysis, describe new methods developed for the key tasks and benchmark datasets for these tasks (when available), and suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome.
Availability: Software tools from the Berkeley Phylogenomics Group are available at http://phylogenomics.berkeley.edu