Review
Genome Biol. 2008 Oct 30;9(10):235.
doi: 10.1186/gb-2008-9-10-235.

Large-scale assignment of orthology: back to phylogenetics?

Toni Gabaldón. Genome Biol. 2008.

Abstract

Reliable orthology prediction is central to comparative genomics. Although orthology is defined by phylogenetic criteria, most automated prediction methods are based on pairwise sequence comparisons. Recently, automated phylogeny-based orthology prediction has emerged as a feasible alternative for genome-wide studies.

Figures

Figure 1
p53 phylogeny. Phylogenetic tree representing the evolutionary relationships among p53 and related proteins. Sequences were obtained from the p53 tree at PhylomeDB [35] (entry code Hsa0012331). After a group of representative sequences was selected, a maximum likelihood tree was reconstructed using the same parameters as for the JTT tree in PhylomeDB. Shaded boxes indicate vertebrate members of the p53, p73 and p73L subfamilies. Duplication nodes are marked with a gray circle. The arrow indicates the speciation node that marks the bifurcation between urochordates and vertebrates.
Figure 2
Orthology prediction methods. (a-c) Pairwise-based and (d, e) phylogeny-based methods. Circles of different colors indicate proteins encoded in genomes from different species. Black arrows represent reciprocal BLAST hits. Proteins within dashed ovals are predicted by the method to belong to the same orthologous group.
(a) Best bi-directional hit (BBH). All pairs of proteins with reciprocal best hits are considered orthologs. Note that this method is unable to predict the orthology with yellow protein 2.
(b) COG-like approach. Proteins in the nodes of triangular networks of BBHs are considered orthologs (green, red and yellow protein 1 in the example). New proteins are added to the orthologous group if they are present in BBH triangles that share an edge with a given cluster; for example, the gray protein is added to the orthologous group because it forms a BBH triangle with the red and green proteins. Note that a BBH link with yellow protein 1 is not required. The COG-like approach can add further proteins from the same genome if they are more similar to each other than to proteins in other genomes, or if they form BBH triangles with members of the cluster. This is not the case for yellow protein 2, which is again misclassified.
(c) Inparanoid approach. This is similar to (a), but other proteins within a proteome (yellow protein 2 in this example) are included as 'in-paralogs' if they are more similar to each other than to their corresponding hits in the other species.
(d) Tree-reconciliation phylogenetic approach. Duplication nodes (marked with a D) are defined by comparing the gene tree (small tree at the top) with the species tree (small tree at the bottom) to derive a reconciled tree (big tree on the right) that includes the minimal number of duplication and gene loss (dashed lines) events necessary to explain the gene tree. In this case, both yellow proteins are included in the orthologous group but the red and gray proteins are excluded.
(e) Species-overlap phylogenetic approach. All proteins that derive from a common ancestor by speciation are considered members of the same orthologous group. A node is called a duplication when the two partitions it defines share at least one species. A one-to-many orthology relationship emerges because of a recent duplication in the lineage leading to the yellow proteome.
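The contrast between panels (a) and (e) can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the article or from any of the tools it reviews: the similarity scores, the `species|gene` leaf labels, the nested-tuple tree encoding and all function names are assumptions chosen for the example. It shows a best bi-directional hit search that misses the second yellow protein, and a species-overlap pass over a toy gene tree that recovers it as a co-ortholog.

```python
from itertools import product

def best_hit(hits):
    """Return the subject with the highest score in a {subject: score} dict."""
    return max(hits, key=hits.get)

def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    pairs = set()
    for a, hits in a_vs_b.items():
        if not hits:
            continue
        b = best_hit(hits)
        back = b_vs_a.get(b)
        if back and best_hit(back) == a:
            pairs.add((a, b))
    return pairs

def species(leaf):
    """Leaf labels are assumed to look like 'species|gene'."""
    return leaf.split("|")[0]

def leaves(node):
    """All leaf labels under a node; a node is a leaf string or a (left, right) tuple."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)

def species_overlap_orthologs(node):
    """Ortholog pairs: leaves whose last common ancestor is a speciation node.

    A node counts as a duplication when its two child partitions share
    at least one species; otherwise it is treated as a speciation.
    """
    if isinstance(node, str):
        return set()
    left, right = node
    pairs = species_overlap_orthologs(left) | species_overlap_orthologs(right)
    l, r = leaves(left), leaves(right)
    if not ({species(x) for x in l} & {species(x) for x in r}):
        pairs |= {frozenset(p) for p in product(l, r)}
    return pairs

# Hypothetical similarity scores (e.g. BLAST bit scores) between two proteomes.
yellow_vs_green = {"yellow|1": {"green|1": 310.0}, "yellow|2": {"green|1": 250.0}}
green_vs_yellow = {"green|1": {"yellow|1": 310.0, "yellow|2": 250.0}}
bbh = bidirectional_best_hits(yellow_vs_green, green_vs_yellow)

# Hypothetical gene tree with a recent duplication in the yellow lineage.
tree = ((("yellow|1", "yellow|2"), "green|1"), "red|1")
orthologs = species_overlap_orthologs(tree)
```

With these made-up inputs, `bbh` contains only the pair of yellow protein 1 and the green protein, whereas `orthologs` pairs both yellow proteins with the green and red proteins and leaves the two yellow proteins unpaired, since their last common ancestor is a duplication. Real pipelines also have to handle alignment thresholds, multifurcating or poorly supported trees, and tree rooting, none of which this sketch attempts.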

References

    1. Fitch WM. Distinguishing homologous from analogous proteins. Syst Zool. 1970;19:99–113. doi: 10.2307/2412448.
    2. Moreira D, Philippe H. Molecular phylogeny: pitfalls and progress. Int Microbiol. 2000;3:9–16.
    3. Gabaldón T. Evolution of proteins and proteomes, a phylogenetics approach. Evol Bioinf Online. 2005;1:51–56.
    4. Gabaldón T, Huynen MA. Prediction of protein function and pathways in the genome era. Cell Mol Life Sci. 2004;61:930–944. doi: 10.1007/s00018-003-3387-y.
    5. Huynen MA, Gabaldón T, Snel B. Variation and evolution of biomolecular systems: searching for functional relevance. FEBS Lett. 2005;579:1839–1845. doi: 10.1016/j.febslet.2005.02.004.
