2009 Dec 11;10:597.
doi: 10.1186/1471-2164-10-597.

Gene socialization: gene order, GC content and gene silencing in Salmonella

Nikolas Papanikolaou et al. BMC Genomics.

Abstract

Background: Genes of conserved order in bacterial genomes tend to evolve more slowly than genes whose order is not conserved. In addition, genes with a GC content lower than that of the resident genome are known to be selectively silenced by the histone-like nucleoid structuring protein (H-NS) in Salmonella.

Results: In this study, we use a comparative genomics approach to demonstrate that in Salmonella, genes whose order is not conserved (or genes without homologs) in closely related bacteria possess a significantly lower average GC content in comparison to genes that preserve their relative position in the genome. Moreover, these genes are more frequently targeted by H-NS than genes that have conserved their genomic neighborhood. We also observed that duplicated genes that do not preserve their genomic neighborhood are, on average, under less selective pressure.

Conclusions: We establish a strong association between gene order, GC content and gene silencing in a model bacterial species. This analysis suggests that genes that are not under strong selective pressure (evolve faster than others) in Salmonella tend to accumulate more AT-rich mutations and are eventually silenced by H-NS. Our findings may establish new approaches for a better understanding of bacterial genome evolution and function, using information from functional and comparative genomics.
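
The GC content comparison described in the abstract can be illustrated with a minimal sketch. It assumes gene sequences are available as plain strings; the function names and the toy sequences are hypothetical and are not taken from the study. The genome-wide value of 52.2% is the one quoted in the Figure 1 caption for S. Typhimurium.

```python
# Minimal sketch: GC content of individual genes compared with the
# genome-wide GC content. Sequences below are toy examples, not real genes.

GENOME_GC = 0.522  # overall GC content of S. Typhimurium quoted in Fig. 1


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


def genes_below_genome_gc(genes: dict, genome_gc: float = GENOME_GC) -> list:
    """Return the names of genes whose GC content is below genome_gc."""
    return sorted(name for name, seq in genes.items()
                  if gc_content(seq) < genome_gc)


if __name__ == "__main__":
    toy_genes = {
        "geneA": "ATGGCGCCGGGCTGA",  # GC-rich toy sequence
        "geneB": "ATGAAATTTATATAA",  # AT-rich toy sequence
    }
    for name, seq in toy_genes.items():
        print(name, round(gc_content(seq), 3))
    print("below genome GC:", genes_below_genome_gc(toy_genes))
```

Ambiguous bases such as N are excluded from the denominator in this sketch; whether the authors handled them the same way is not stated in the abstract.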

Figures

Figure 1
Correlation of gene order conservation with sequence identity and GC content (Salmonella vs E. coli K12). (A) Sequence identity frequency distributions of proteins encoded by GCO/nGCO genes for the datasets used in this study. Box-and-whisker plots illustrate the differences between the medians and the dispersion of the respective datasets. Orange and light blue represent the GCO and nGCO datasets, respectively. Average values are displayed on top of the box plots. The two leftmost box plots (GCO, nGCO) depict differences between those two gene classes within the overall protein sequence dataset. The next two depict differences between duplicated GCO genes (DGCO) and duplicated nGCO genes (DnGCO). The last two represent H-NS-repressed genes (HNSGCO: H-NS-repressed GCO genes; HNSnGCO: H-NS-repressed nGCO genes). (B) GC content of GCO genes, nGCO genes and genes with no homolog in E. coli K12 (NH) for the datasets used in this study. In addition to the coloring scheme of Fig. 1A, light grey is used for sequences that had no homolog in E. coli K12. The dashed horizontal line corresponds to the overall GC content of the S. Typhimurium genome (52.2%). (For a more detailed description, including statistical analysis, see Additional files 8 and 9.)
Figure 2
Box-and-whisker plots of Ka/Ks ratio distributions for GCO and nGCO genes for the datasets used in this study. Significant differences in Ka/Ks ratios were observed between GCO and nGCO genes both in the overall dataset (Wilcoxon rank-sum test: W = 203313, P-value = 0, standard deviations 0.09 and 0.18 respectively) and in the subset of duplicated genes (DGCO: duplicated GCO genes; DnGCO: duplicated nGCO genes; W = 13595.5, P-value = 0, standard deviations 0.13 and 0.17 respectively). The coloring scheme is the same as in Fig. 1A for easy comparison.
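
The Wilcoxon rank-sum comparison named in this caption can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: it reports W in the Mann-Whitney form (rank sum of the first sample minus its minimum possible value), uses a normal approximation with tie correction and no continuity correction, and runs on hypothetical Ka/Ks values. The function name `rank_sum_test` is likewise hypothetical.

```python
# Minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test.
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Return (W, two-sided p) for samples x and y.

    W is the rank sum of x minus n1*(n1+1)/2; the p-value comes from a
    normal approximation with tie correction (no continuity correction).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this block of tied values
        mid_rank = (i + 1 + j) / 2   # average of the ranks i+1 .. j
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += mid_rank * in_x
        tie_term += t ** 3 - t
        i = j
    w = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return w, 1.0
    z = (w - mean) / math.sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, p


if __name__ == "__main__":
    # Hypothetical Ka/Ks ratios for two gene classes (not data from the paper).
    gco = [0.03, 0.05, 0.04, 0.06, 0.02, 0.05]
    ngco = [0.10, 0.08, 0.21, 0.15, 0.07, 0.30]
    w, p = rank_sum_test(gco, ngco)
    print(f"W = {w}, p = {p:.4g}")
```

For small samples an exact test is preferable to the normal approximation; with thousands of genes per class the approximation is the usual choice, and very small p-values may be printed as 0 by statistical software.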
Figure 3
Distribution of the three categories of genes (GCO, nGCO and genes with no homolog in E. coli) in the complete Salmonella gene set and in the subsets of H-NS-repressed genes [13] and essential genes [28]. GCO genes are overrepresented in the essential gene set and underrepresented among H-NS-repressed genes. In contrast, nGCO genes and genes with no homolog in E. coli are overrepresented among H-NS-repressed genes. As expected, genes with no homolog in E. coli are underrepresented in the essential gene set. In the random dataset, the representation is the same as in the overall Salmonella genome. The coloring scheme is the same as in Fig. 1B for easy comparison.
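
One common way to quantify over- or underrepresentation of a gene category in a subset is a hypergeometric tail probability. The caption does not state which test the authors used, so the sketch below is an assumption offered for illustration only; the function name and all counts are hypothetical.

```python
# Minimal sketch: hypergeometric test for overrepresentation of a gene
# category in a subset drawn from the full gene set.
from math import comb


def hypergeom_upper_tail(k: int, total: int, category: int, subset: int) -> float:
    """P(X >= k) when drawing `subset` genes without replacement from
    `total` genes, of which `category` belong to the class of interest."""
    lo = max(k, 0)
    hi = min(category, subset)
    favourable = sum(comb(category, i) * comb(total - category, subset - i)
                     for i in range(lo, hi + 1))
    return favourable / comb(total, subset)


if __name__ == "__main__":
    # Hypothetical counts, not values from the paper:
    total_genes = 4500    # genes in the genome
    ngco_genes = 900      # genes in the category of interest
    repressed = 400       # genes in the repressed subset
    ngco_repressed = 150  # category genes observed in that subset
    p = hypergeom_upper_tail(ngco_repressed, total_genes, ngco_genes, repressed)
    print(f"P(X >= {ngco_repressed}) = {p:.3g}")
```

A small upper-tail probability indicates overrepresentation; underrepresentation can be assessed with the complementary lower tail.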

References

    1. Dandekar T, Snel B, Huynen M, Bork P. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem Sci. 1998;23(9):324–328. doi: 10.1016/S0968-0004(98)01274-2.
    2. Theodosiou T, Iliopoulos I. Protein sequences of linked genes are highly conserved in two bacterial species. J Evol Biol. 2006;19(4):1343–1345. doi: 10.1111/j.1420-9101.2006.01093.x.
    3. Cohen BA, Mitra RD, Hughes JD, Church GM. A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression. Nat Genet. 2000;26(2):183–186. doi: 10.1038/79896.
    4. Williams EJ, Hurst LD. The proteins of linked genes evolve at similar rates. Nature. 2000;407(6806):900–903. doi: 10.1038/35038066.
    5. Pal C, Hurst LD. Evidence for co-evolution of gene order and recombination rate. Nat Genet. 2003;33(3):392–395. doi: 10.1038/ng1111.
