Review
Appl Bioinformatics. 2002;1(2):81-92.

Genomic biodiversity, phylogenetics and coevolution in proteins

David D Pollock. Appl Bioinformatics. 2002.

Abstract

Comprehensive sampling of genomic biodiversity is fast becoming a reality for some genomic regions and complete organelle genomes. Genomic biodiversity is defined as large genomic sequences from many species, and here some recent work is reviewed that demonstrates the potential benefits of genomic biodiversity for molecular evolutionary analysis and phylogenetic reconstruction. This work shows that using likelihood-based approaches, taxon addition can dramatically improve phylogenetic reconstruction. Features or dynamics of the evolutionary process are much more easily inferred with large numbers of taxa, and large numbers are essential for discriminating differences in evolutionary patterns between sites. Accurate prediction of site-specific patterns can improve phylogenetic reconstruction by an amount equivalent to quadrupling sequence length. Genomic biodiversity is particularly central to research relating patterns of evolution, adaptation and coevolution to structural and functional features of proteins. Research on detecting coevolution between amino acid residues in proteins demonstrates a clear need for much greater numbers of closely related taxa to better discriminate site-specific patterns of interaction, and to allow more detailed analysis of coevolutionary interactions between subunits in protein complexes. It is argued that parsing out coevolutionary and other context-dependent substitution probabilities is essential for discriminating between coevolution and adaptation, and for more realistically modelling the evolution of proteins. Also reviewed is research that argues for increasing the efficiency of acquiring genomic biodiversity, and suggests that this might be done by simultaneously shotgun cloning and sequencing genomic mixtures from many species. 
Increased efficiency is a prerequisite if genomic biodiversity levels are to rapidly increase by orders of magnitude, and thus lead to dramatically improved understanding of interactions between protein structure, function and sequence evolution.

Figures

Figure 1
Graphic visualisation of taxon addition. In the tree on the right, the thin grey branches leading to the newly added tips tend to be shorter than the branches on the initial tree to the left. Branches on the initial tree that are split by the addition of new taxa are necessarily shorter. The effect of taxon addition is not confounded by the differences in branch lengths and placement if the accuracies of reconstructing the same initial tree (thick black) branches are considered in both trees.
Figure 2
Decreasing power curve relationship between phylogenetic error and sequence length. If sequence length is $N$, error is approximately proportional to $32 \cdot N^{-0.826}$. The slope is steep initially, but decreases rapidly. Between 1000 and 3000 nucleotides, the slope is relatively shallow, and the curve is nearly straight.
Figure 3
Relationship of doppelgänger trees to focus tree of interest. The shadowy thin grey doppelgänger trees are identical in structure but evolve independently of the focus tree. This is equivalent to being connected to the focus tree by a branch with infinite length. In the bold black focus tree, only reconstruction of the short grey innermost branch was considered.
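
The power curve quoted in the Figure 2 caption can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the caption's constants (32 and -0.826) at face value, treats the error as being in arbitrary units, and uses hypothetical function names (`phylo_error`, `error_slope`, `error_ratio`) that do not come from the paper. It also shows what the fitted curve alone would imply for a quadrupling of sequence length, the benchmark the abstract uses for the benefit of accurate site-specific prediction.

```python
# Minimal sketch of the Figure 2 power curve: error ~ 32 * N**(-0.826).
# The constants come from the caption; function names and units are
# assumptions made for illustration only.

SCALE = 32.0       # multiplicative constant quoted in the caption
EXPONENT = -0.826  # power-law exponent quoted in the caption


def phylo_error(n_sites: float) -> float:
    """Approximate phylogenetic error (arbitrary units) at sequence length n_sites."""
    if n_sites <= 0:
        raise ValueError("sequence length must be positive")
    return SCALE * n_sites ** EXPONENT


def error_slope(n_sites: float) -> float:
    """Derivative of the fitted curve with respect to sequence length."""
    if n_sites <= 0:
        raise ValueError("sequence length must be positive")
    return SCALE * EXPONENT * n_sites ** (EXPONENT - 1.0)


def error_ratio(length_factor: float) -> float:
    """Factor by which error changes when sequence length is multiplied by length_factor."""
    if length_factor <= 0:
        raise ValueError("length factor must be positive")
    return length_factor ** EXPONENT


if __name__ == "__main__":
    for n in (100, 1000, 3000):
        print(n, phylo_error(n), error_slope(n))
    # Under this fit, quadrupling length leaves roughly a third of the error.
    print("ratio for 4x length:", error_ratio(4.0))
```

Because the curve is a pure power law, the ratio returned by `error_ratio` does not depend on the starting length, which is why a gain can be expressed as an equivalent multiple of sequence length.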
