Genome Res. 2016 Jun;26(6):826-33.
doi: 10.1101/gr.200097.115. Epub 2016 Apr 14.

Evolutionary assembly patterns of prokaryotic genomes

Maximilian O Press et al. Genome Res. 2016 Jun.

Abstract

Evolutionary innovation must occur in the context of some genomic background, which limits available evolutionary paths. For example, protein evolution by sequence substitution is constrained by epistasis between residues. In prokaryotes, evolutionary innovation frequently happens by macrogenomic events such as horizontal gene transfer (HGT). Previous work has suggested that HGT can be influenced by ancestral genomic content, but the extent of such gene-level constraints has not been systematically characterized. Here, we evaluated the evolutionary impact of such constraints in prokaryotes, using probabilistic ancestral reconstructions from 634 extant prokaryotic genomes and a novel framework for detecting evolutionary constraints on HGT events. We identified 8228 directional dependencies between genes and demonstrated that many such dependencies reflect known functional relationships, including, for example, evolutionary dependencies of the photosynthetic enzyme RuBisCO. Modeling all dependencies as a network, we adapted an approach from graph theory to establish chronological precedence in the acquisition of different genomic functions. In particular, we demonstrated that specific functions tend to be gained sequentially, suggesting that evolution in prokaryotes is governed by functional assembly patterns. Finally, we showed that these dependencies are universal rather than clade-specific and are often sufficient for predicting whether or not a given ancestral genome will acquire specific genes. Combined, our results indicate that evolutionary innovation via HGT is profoundly constrained by epistasis and historical contingency, similar to the evolution of proteins and phenotypic characters, and suggest that the emergence of specific metabolic and pathological phenotypes in prokaryotes may be predictable from current genomes.

Figures

Figure 1.
Workflow for deriving the PGCE network. (A) A model phylogeny and a set of gene presence/absence patterns at the tips are used to generate an ancestral reconstruction, from which gains are inferred. Filled circles represent the presence of a gene (distinguished by color), and empty circles represent absence of that gene. Inverted triangles represent points on the phylogeny where the gene of the indicated color is inferred to be gained. (B) Based on inferred gain and loss rates, many evolutionary scenarios are independently simulated and used as a null expectation for evolutionary independence. Filled circles indicate presence of the simulated gene, and empty circles indicate absence; inverted triangles represent gains of the simulated gene on the phylogeny. (C) A null distribution derived from simulated gene evolution is used to identify dependencies between real genes. (D) These dependencies are modeled as a network. Filled circles indicate genes (nodes); arrows indicate dependencies (edges).
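Panel C of this workflow amounts to an empirical significance test against a simulated null. The following is a minimal sketch of that idea only, assuming each gene pair is summarized by a single scalar statistic (the statistic the authors actually use is not given in this caption). The function name `empirical_p_value` and the example numbers are hypothetical.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value of `observed` against a simulated null.

    Assumption: larger statistics indicate stronger dependency.
    The +1 in numerator and denominator keeps the estimate away from
    exactly zero when the number of simulations is finite.
    """
    if not null_values:
        raise ValueError("need at least one simulated null value")
    at_least_as_extreme = sum(1 for v in null_values if v >= observed)
    return (at_least_as_extreme + 1) / (len(null_values) + 1)


# Hypothetical example: an observed statistic of 5 compared with
# four simulated values, none of which reaches it.
p = empirical_p_value(5, [1, 2, 3, 4])
```

With many gene pairs tested at once, p-values obtained this way would normally be passed through a multiple-testing correction before any pair is called a dependency.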
Figure 2.
PGCEs are enriched for biologically meaningful interactions. (A) The observed number of PGCE edges connecting genes in the same pathway (dashed line), compared to the expected distribution obtained from 1000 rewired networks with identical degree distributions. (B) The observed number of PGCE edges that also appear in a bacteria-wide metabolic network, compared to the expected distribution.
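The comparison in panel A relies on rewired networks that keep every node's degree fixed. One common way to build such a null is repeated edge swapping; the sketch below illustrates that technique for a directed network and is not necessarily the procedure the authors used. The names `rewire_preserving_degrees` and `count_same_pathway`, and the toy pathway labels, are hypothetical.

```python
import random


def rewire_preserving_degrees(edges, n_swaps, rng):
    """Shuffle a directed edge list by swapping targets of edge pairs.

    Swapping (a -> b), (c -> d) into (a -> d), (c -> b) leaves every
    node's in-degree and out-degree unchanged. Swaps that would create
    a self-loop or a duplicate edge are skipped.
    """
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def count_same_pathway(edges, pathway):
    """Count edges whose two genes carry the same pathway label."""
    return sum(
        1
        for u, v in edges
        if pathway.get(u) is not None and pathway.get(u) == pathway.get(v)
    )


# Hypothetical toy network and pathway annotation.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
toy_pathway = {"a": "p1", "b": "p1", "c": "p2", "d": "p2"}
observed = count_same_pathway(toy_edges, toy_pathway)
null_counts = [
    count_same_pathway(
        rewire_preserving_degrees(toy_edges, 50, random.Random(seed)),
        toy_pathway,
    )
    for seed in range(1000)
]
```

The observed count can then be placed against the distribution of `null_counts`, which mirrors the dashed line and histogram layout described in the caption.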
Figure 3.
The phylogenetic history of rbsL, urtA, and rbsS. The presence of each gene in each branch in the phylogenetic tree is illustrated with a colored circle, with the circle's diameter scaled to denote the probability of presence. (A) rbsL and rbsS evolutionary histories. (B) urtA and rbsS evolutionary histories. The long branch leading to Archaea (bottommost clade) was reduced in size for graphical purposes.
Figure 4.
Topological sorting of the PGCE dependency network reveals assembly patterns that govern the evolutionary process. (A) Binned dependencies among the five ranks of genes in the topological sort (left to right). Node size represents the number of genes in each rank (natural logarithm scale). Edge width represents the number of PGCEs between genes in different ranks (natural logarithm scale); all edges are directed to the right. (B) The gain of genes from each rank in each branch of the phylogenetic tree is illustrated (circles). The different colors represent different ranks. Circle sizes correspond to the proportion of gains on a branch attributed to genes of that rank (e.g., a large red circle indicates that most gains on a branch correspond to rank 1). The branch to Archaea (bottommost clade) has been reduced in size for graphical purposes. See also Supplemental Figure S7.
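A topological sort can be turned into ranks by peeling off, layer by layer, the genes whose prerequisites have all been placed. The sketch below shows this layered variant of Kahn's algorithm under two assumptions that are not stated in the caption: an edge `(a, b)` means gene `a` tends to be gained before gene `b`, and the network is acyclic. The function name `rank_genes` and the gene labels are hypothetical.

```python
from collections import defaultdict


def rank_genes(edges):
    """Assign each gene a rank so that every edge points to a higher rank.

    Rank 1 holds genes with no incoming dependency; a gene's rank is one
    more than the highest rank among the genes it depends on.
    """
    successors = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for a, b in edges:
        nodes.update((a, b))
        if b not in successors[a]:
            successors[a].add(b)
            indegree[b] += 1

    current = sorted(n for n in nodes if indegree[n] == 0)
    ranks = {}
    rank = 1
    while current:
        next_layer = []
        for n in current:
            ranks[n] = rank
            for m in successors[n]:
                indegree[m] -= 1
                if indegree[m] == 0:
                    next_layer.append(m)
        current = sorted(next_layer)
        rank += 1

    if len(ranks) != len(nodes):
        raise ValueError("dependency graph contains a cycle")
    return ranks


# Hypothetical dependencies: A precedes B and C, which both precede D.
ranks = rank_genes([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])
```

Here `ranks` places A in rank 1, B and C in rank 2, and D in rank 3, so every edge runs from a lower rank to a higher one, matching the left-to-right layout of panel A.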
Figure 5.
PGCE dependencies lead to taxonomically robust predictability of gene acquisition. (A) Workflow for predicting gene acquisition between clades of the tree. A training set is used to build a PGCE dependency model, which is then used to predict on which specific branches genes are likely to be gained (green circles), based on dependencies inferred from the training set (red and blue circles). (B) Performance of PGCEs in predicting gene acquisitions in two test sets (indicated clades of the prokaryotic tree). Areas under each curve: Firmicutes, 0.73; Alpha/Beta-proteobacteria, 0.68. The diagonal dotted line represents the performance of a purely random prediction. See also Supplemental Figure S9.
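The areas reported in panel B are areas under ROC curves, where 0.5 corresponds to the random-prediction diagonal. As a minimal sketch of how such an area can be computed from prediction scores, the function below uses the pairwise-ranking definition of AUC. The name `roc_auc` and the example scores are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive case
    (gene gained on a branch) receives a higher score than a randomly
    chosen negative case, with ties counted as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four branch/gene pairs; label 1 means "gained".
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this toy case three of the four positive/negative pairs are ordered correctly, so `auc` is 0.75. The pairwise loop is quadratic in the number of cases, which is acceptable for illustration but not for large test sets.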
