Proc Natl Acad Sci U S A. 2012 Sep 4;109(36):14504-7.
doi: 10.1073/pnas.1205683109. Epub 2012 Aug 20.

A selective force favoring increased G+C content in bacterial genes

Rahul Raghavan et al. Proc Natl Acad Sci U S A. 2012.

Abstract

Bacteria display considerable variation in their overall base compositions, which range from 13% to over 75% G+C. This variation in genomic base compositions has long been considered to be a strictly neutral character, due solely to differences in the mutational process; however, recent sequence comparisons indicate that mutational input alone cannot produce the observed base compositions, implying a role for natural selection. Because bacterial genomes have high gene content, forces that operate on the base composition of individual genes could help shape the overall genomic base composition. To explore this possibility, we tested whether genes that encode the same protein but vary only in their base compositions at synonymous sites have effects on bacterial fitness. Escherichia coli strains harboring G+C-rich versions of genes display higher growth rates, indicating that despite a pervasive mutational bias toward A+T, a selective force, independent of adaptive codon use, is driving genes toward higher G+C contents.
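The comparison described in the abstract, genes that encode the same protein but differ in base composition at synonymous sites, can be illustrated with a minimal sketch. The two short sequences below are invented for illustration and are not the constructs used in the study; the function names are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a coding sequence (complete codons only) into one-letter amino acids."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc3_percent(seq):
    """Percentage of G and C bases at third codon positions."""
    return gc_percent(seq.upper()[2::3])


# Invented toy sequences (assumption): both encode Met-Lys-Leu-Ala.
at_rich = "ATGAAATTAGCA"
gc_rich = "ATGAAGCTGGCC"

if __name__ == "__main__":
    print(translate(at_rich), translate(gc_rich))
    print(gc_percent(at_rich), gc_percent(gc_rich))
    print(gc3_percent(at_rich), gc3_percent(gc_rich))
```

Because the two toy sequences translate to the same peptide, any difference between them lies only in synonymous codon choice, which is the property the experimental design relies on.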

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Growth rates of isogenic strains expressing GFP genes of different G+C contents. Growth rate (OD600) was measured hourly for 5 h after induction with 1 mM IPTG. A significant association between %G+C and growth rate is observed at 2 h after induction and is evident at all later time points (0 h, r² = 0.03; 1 h, r² = 0.15; 2 h, r² = 0.40; 3 h, r² = 0.47; 4 h, r² = 0.67; 5 h, r² = 0.72). Asterisks denote level of significance: *P < 0.05, **P < 0.01, ***P < 0.001. Values represent the mean ± SD of three replicates. For detailed results of regressions, see Table S1.
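The r² values in this caption summarize how much of the variation in growth is explained by a straight-line fit to %G+C. A minimal sketch of that calculation for a simple linear regression follows. The numbers in the usage section are placeholders made up for illustration, not measurements from the study, and `r_squared` is a hypothetical helper name.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Placeholder values (assumption): %G+C of gene variants and OD600 readings.
    gc_content = [40.0, 45.0, 50.0, 55.0]
    od600 = [0.50, 0.55, 0.62, 0.70]
    print(round(r_squared(gc_content, od600), 3))
```

Running the function once per time point would give one r² per hour, in the way the caption reports them; significance testing is not covered by this sketch.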
Fig. 2.
No significant association between codon adaptation index (CAI) and growth rate. Growth of E. coli strains expressing GFP genes with CAI values ranging from 0.58 to 0.68 (calculated using CAIcal; ref. 15) was monitored by measuring OD600 every hour. Values shown were measured at 5 h after gene induction and represent the mean ± SD of three replicates.
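For readers unfamiliar with CAI, the index is commonly defined as the geometric mean of per-codon relative adaptiveness values, each being a codon's frequency in a reference gene set divided by the frequency of the most-used synonymous codon. The sketch below follows that general definition. It is not the CAIcal implementation cited in the caption; the reference sequence is invented, and the pseudocount for codons absent from the reference is an assumption of this sketch.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Per-codon weights relative to the most-used synonymous codon.

    Met, Trp and stop codons are excluded. Codons never seen in the
    reference set get `pseudocount` (an assumption of this sketch).
    """
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    weights = {}
    for aa in set(AMINO) - set("MW*"):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        family_counts = {c: counts[c] or pseudocount for c in family}
        top = max(family_counts.values())
        for c in family:
            weights[c] = family_counts[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in `seq`."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Invented reference (assumption): lysine codon AAA used three times, AAG once.
    w = relative_adaptiveness(["AAAAAAAAAAAG"])
    print(cai("ATGAAAAAG", w))
```

In practice the weights would come from a reference set of highly expressed genes of the host organism, so the absolute values produced by this toy reference carry no biological meaning.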
Fig. 3.
Significant association between bacterial growth rate and base composition for genes spanning different ranges of G+C contents. Growth rates for E. coli strains expressing GFP gene variants with base compositions ranging from 40.4% to 53.7% G+C (A) and Bacillus phage ϕ29 DNA polymerase gene variants with base compositions ranging from 43.7% to 47.2% G+C (B), measured at 5 h after induction with 1 mM IPTG. Values represent the mean ± SD of three replicates.
Fig. 4.
Effect of translation on the association between base composition and bacterial growth rates. Growth of E. coli strains expressing GFP gene variants of different base compositions in which translation of the GFP gene was prevented by removing the ribosome binding sites (ΔRBS) and start codons. Growth rates were measured at 5 h after induction; values represent the mean ± SD of three replicates.
Fig. 5.
Relationship between the average G+C content at fourfold degenerate sites of all protein-coding genes and the average G+C content of noncoding intergenic regions in fully sequenced bacterial genomes (n = 1430). Green circles denote bacteria with genome size of less than one megabase (n = 67), and the red line indicates the logistic regression model fitted to the data.
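The two quantities plotted in this figure can be sketched as follows. Fourfold degenerate sites are third codon positions where any of the four bases encodes the same amino acid; in the standard genetic code these belong to the eight codon families listed in the code. This sketch pools sites across all supplied genes, which is an assumption, since the caption does not say whether the average was taken per gene or over pooled sites. Function names and the toy sequences are hypothetical.

```python
# Codon prefixes whose third position is fourfold degenerate in the standard
# genetic code: Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN), Thr (ACN),
# Ala (GCN), Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = frozenset({"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"})


def gc4_percent(coding_seqs):
    """Percent G+C at fourfold degenerate sites, pooled over all sequences."""
    gc = total = 0
    for seq in coding_seqs:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            # Skip ambiguous third bases such as N.
            if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
                total += 1
                gc += codon[2] in "GC"
    if total == 0:
        raise ValueError("no fourfold degenerate sites found")
    return 100.0 * gc / total


def gc_percent(seq):
    """Percent G+C of a sequence, for example a noncoding intergenic region."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    # Toy inputs (assumption), not taken from any sequenced genome.
    genes = ["ATGGCCGCAAAA", "ATGCTGGGGACT"]
    intergenic = "TTATAAGCATTA"
    print(gc4_percent(genes), gc_percent(intergenic))
```

Computing both values for each genome gives one point per genome on a plot like the one described; fitting the regression curve is outside the scope of this sketch.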

References

    1. McCutcheon JP, Moran NA. Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution. Genome Biol Evol. 2010;2:708–718.
    2. Thomas SH, et al. The mosaic genome of Anaeromyxobacter dehalogenans strain 2CP-C suggests an aerobic common ancestor to the delta-proteobacteria. PLoS ONE. 2008;3:e2103.
    3. Sueoka N. On the genetic basis of variation and heterogeneity of DNA base composition. Proc Natl Acad Sci USA. 1962;48:582–592.
    4. Freese E. On the evolution of base composition of DNA. J Theor Biol. 1962;3:82–101.
    5. Muto A, Osawa S. The guanine and cytosine content of genomic DNA and bacterial evolution. Proc Natl Acad Sci USA. 1987;84:166–169.
