PLoS One. 2010 Mar 23;5(3):e9841.
doi: 10.1371/journal.pone.0009841.

Abundant oligonucleotides common to most bacteria


Colin F Davenport et al. PLoS One. 2010.

Abstract

Background: Bacteria show a bias in their genomic oligonucleotide composition far beyond that dictated by G+C content. Patterns of over- and underrepresented oligonucleotides carry a phylogenetic signal and are thus diagnostic for individual species. Patterns of short oligomers have been investigated by multiple groups in large numbers of bacterial genomes. However, global distributions of the most highly overrepresented mid-sized oligomers have not been assessed across all prokaryotes to date. We surveyed overrepresented mid-length oligomers across all prokaryotes and normalised for base composition and embedded oligomers using zero- and second-order Markov models.
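To make the zero-order normalisation concrete, here is a minimal sketch of how an oligomer's observed count could be compared with its expectation from base composition alone. This is an illustration, not the authors' pipeline: the function names are hypothetical, only one strand is scanned, and the second-order Markov correction for embedded oligomers mentioned above is not implemented.

```python
from collections import Counter


def zero_order_expected(genome: str, oligo: str) -> float:
    """Expected count of `oligo` if every base were drawn independently
    at the genome's observed single-base frequencies (zero-order model)."""
    n = len(genome)
    freqs = {base: count / n for base, count in Counter(genome).items()}
    prob = 1.0
    for base in oligo:
        prob *= freqs.get(base, 0.0)
    # Number of windows of len(oligo) that fit in the sequence.
    return prob * (n - len(oligo) + 1)


def observed_count(genome: str, oligo: str) -> int:
    """Count occurrences of `oligo`, including overlapping ones."""
    count = 0
    start = genome.find(oligo)
    while start != -1:
        count += 1
        start = genome.find(oligo, start + 1)
    return count


def overrepresentation(genome: str, oligo: str) -> float:
    """Observed / expected ratio; values above 1 indicate overrepresentation."""
    expected = zero_order_expected(genome, oligo)
    if expected == 0.0:
        raise ValueError("oligomer contains a base absent from the genome")
    return observed_count(genome, oligo) / expected
```

A ratio well above 1 under this model means the oligomer is more frequent than base composition alone would predict, which is the sense of "overrepresented" used here.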

Principal findings: Here we report a presumably ancient set of oligomers conserved and overrepresented in nearly all branches of prokaryotic life, including Archaea. These oligomers are either adenine-rich homopurines with one to three guanine nucleosides, or homopyrimidines with one to four cytosine nucleosides. They do not show a consistent preference for coding or non-coding regions or aggregate in any coding frame, implying a role in DNA structure and as polypeptide binding sites. Structural parameters indicate these oligonucleotides to be an extreme and rigid form of B-DNA prone to forming triple-stranded helices under common physiological conditions. Moreover, the narrow minor grooves of these structures are recognised by DNA-binding and nucleoid-associated proteins such as HU.
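The two compositional classes described above can be expressed as a simple check on base counts. The sketch below is an assumption-laden illustration: the function name and the returned labels are hypothetical, and it tests only the stated guanine and cytosine ranges, not overrepresentation.

```python
def classify_oligomer(oligo: str) -> str:
    """Label an oligomer by the two classes named in the abstract.

    Returns "homopurine" for A/G-only oligomers with 1-3 guanines,
    "homopyrimidine" for T/C-only oligomers with 1-4 cytosines,
    and "other" for everything else.
    """
    seq = oligo.upper()
    bases = set(seq)
    if bases <= {"A", "G"} and 1 <= seq.count("G") <= 3:
        return "homopurine"
    if bases <= {"T", "C"} and 1 <= seq.count("C") <= 4:
        return "homopyrimidine"
    return "other"
```

For example, the octamer AAGAAAAA shown in Figure 3 contains only purines and a single guanine, so it falls into the first class.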

Conclusion: Homopurine and homopyrimidine oligomers exhibit distinct and unusual structural features and are present at high copy number in nearly all prokaryotic lineages. This fact suggests a non-neutral role of these oligonucleotides for bacterial genome organization that has been maintained throughout evolution.


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Normalised copy numbers of each oligomer.
Box and whisker plots showing the distribution of copy numbers per megabase for the 15 overrepresented oligomers for all chromosomes in which they were overrepresented. The upper end of the dashed line is the 95% confidence interval, beyond which outlier chromosomes with very high copy numbers are depicted as triangles. The lower limit is set by the lower threshold of 31 oligomer copies per megabase, i.e. twice the expected value of 15.2 for a randomly distributed octamer in one megabase. Note that GC content was previously controlled for by the zero-order and second-order Markov models used to select and verify the datasets respectively.
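The threshold quoted in this caption follows from simple arithmetic, sketched here under the assumption of four equiprobable bases and ignoring the few window positions lost at the sequence ends. The variable names are illustrative only.

```python
import math

OLIGO_LENGTH = 8          # octamer
MEGABASE = 1_000_000      # sequence length in bases

# With four equiprobable bases there are 4**8 = 65,536 possible octamers,
# so each one is expected roughly once every 65,536 positions.
expected_per_mb = MEGABASE / 4 ** OLIGO_LENGTH   # about 15.26, quoted as 15.2

# The lower threshold is twice the expectation, rounded up to a whole copy.
threshold = math.ceil(2 * expected_per_mb)       # 31 copies per megabase
```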
Figure 2. Localisation in coding and non-coding regions.
Localisation of abundant oligomers in coding regions and individual coding frames. The oligomer and the number of chromosomes it is found in are listed in the title of the top left graph. This histogram shows the distribution, in red, of chromosomes where this oligomer is present in coding regions (as a percentage of all occurrences of the oligomer). This histogram can be compared and contrasted with the distribution of percentage of genomic coding regions across all 684 chromosomes used in the analysis, which is presented in a blue histogram below. On the top right a box and whisker plot displays the localisation in coding regions of this oligomer across all chromosomes in which it is found, and the percentage of occurrences which are not in the translated reading frame. The remaining three scatter plots (middle right, bottom left and right) show the proportion of the oligomers in reading frames 1, 2, and 3 respectively. Frame 1 is considered “in frame”. Together, these figures demonstrate the lack of bias of these oligomers towards any particular reading frame in the chromosomes in which they are overrepresented.
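As a rough illustration of how occurrences might be assigned to reading frames, the sketch below handles a single forward-strand gene with 0-based coordinates. The function names and the coordinate convention are assumptions made for this example; the caption does not specify how the authors computed frames.

```python
from collections import Counter
from typing import Dict, List, Optional


def reading_frame(gene_start: int, gene_end: int, pos: int) -> Optional[int]:
    """Return frame 1, 2 or 3 for an oligomer starting at `pos`,
    or None if `pos` lies outside the gene [gene_start, gene_end).
    Frame 1 means the oligomer starts on a codon boundary ("in frame")."""
    if not gene_start <= pos < gene_end:
        return None
    return (pos - gene_start) % 3 + 1


def frame_proportions(gene_start: int, gene_end: int,
                      positions: List[int]) -> Dict[int, float]:
    """Proportion of coding occurrences falling in each frame."""
    frames = [reading_frame(gene_start, gene_end, p) for p in positions]
    coding = [f for f in frames if f is not None]
    counts = Counter(coding)
    total = len(coding)
    return {f: (counts[f] / total if total else 0.0) for f in (1, 2, 3)}
```

If an oligomer had no frame preference, the three proportions would each be close to one third across many genes, which is the pattern the scatter plots are described as showing.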
Figure 3. Oligomers do not cluster at particular genomic positions.
Distribution of the oligomer AAGAAAAA in four genomes from the Betaproteobacteria, Gammaproteobacteria, Bacteroidetes, and Thermotogae. No distinct clusters of this oligomer are present, rather they are distributed throughout the genome. Similar distributions were also observed in other genomes.

