FASEB J. 2020 Jan;34(1):1038-1051.
doi: 10.1096/fj.201901536RR. Epub 2019 Nov 28.

DNA sequence repeats identify numerous Type I restriction-modification systems that are potential epigenetic regulators controlling phase-variable regulons; phasevarions

John M Atack et al. FASEB J. 2020 Jan.

Abstract

Over recent years several examples of randomly switching methyltransferases, associated with Type III restriction-modification (R-M) systems, have been described in pathogenic bacteria. In every case examined, changes in simple DNA sequence repeats result in variable methyltransferase expression, which leads to global changes in gene expression and differentiation of the bacterial cell into distinct phenotypes. These epigenetic regulatory systems are called phasevarions (phase-variable regulons) and are widespread in bacteria, with 17.4% of Type III R-M systems containing simple DNA sequence repeats. A distinct, recombination-driven random switching system that also regulates gene expression has been described in Type I R-M systems in Streptococci. Here, we interrogate the most extensive and well-curated database of R-M systems, REBASE, by searching for all possible simple DNA sequence repeats in the hsdRMS genes that encode Type I R-M systems. We report that 7.9% of hsdS, 2% of hsdM, and 4.3% of hsdR genes contain simple sequence repeats that are capable of mediating phase variation. Phase variation of both hsdM and hsdS genes will lead to differential methyltransferase expression or specificity, and thereby the potential to control phasevarions. These data suggest that in addition to well characterized phasevarions controlled by Type III mod genes, and the previously described Streptococcal Type I R-M systems that switch via recombination, approximately 10% of all Type I R-M systems surveyed herein have independently evolved the ability to randomly switch expression via simple DNA sequence repeats.
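
The repeat search described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name find_ssr_tracts, the minimum repeat counts, the maximum unit length, and the example sequence are all assumptions, since the abstract does not state the thresholds applied to REBASE.

```python
import re

# Hypothetical minimum repeat counts per repeat-unit length.
# These thresholds are assumptions for illustration, not values from the paper.
MIN_REPEATS = {1: 8, 2: 5}
DEFAULT_MIN_REPEATS = 3


def is_primitive(unit):
    """Return True if the unit is not itself a repetition of a shorter unit (e.g. 'GG')."""
    n = len(unit)
    return not any(n % k == 0 and unit[:k] * (n // k) == unit for k in range(1, n))


def find_ssr_tracts(seq, max_unit=9):
    """Return sorted (start, unit, repeat_count) tuples for simple sequence repeats."""
    seq = seq.upper()
    tracts = []
    for k in range(1, max_unit + 1):
        min_rep = MIN_REPEATS.get(k, DEFAULT_MIN_REPEATS)
        # A unit of length k followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):  # skip e.g. 'GG' tracts already reported as 'G'
                tracts.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(tracts)


if __name__ == "__main__":
    # Made-up sequence containing a G[10] tract and an AAGAC[4] tract.
    example = "ATGCC" + "G" * 10 + "TTA" + "AAGAC" * 4 + "TGA"
    for start, unit, count in find_ssr_tracts(example):
        print(f"{unit}[{count}] at position {start}")
```

A real survey would also need to record where each tract sits relative to the open reading frame, because only tracts within or just upstream of a gene can plausibly mediate phase variation.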

Keywords: R‐M systems; bacterial pathogenesis; epigenetics; phase variation; phasevarion.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
A, Illustration of how phase‐variable switching of hsdS genes occurs. Type I R‐M loci are made up of three genes, encoding a restriction enzyme (hsdR), a methyltransferase (hsdM), and a target sequence specificity protein (hsdS). Each hsdS gene is made up of two target recognition domains (TRDs; TRD 1 in red, and TRD 2 in green), with the SSR tract located between the two TRDs (gray boxes). Loss or gain of repeat units in the SSR tract results either in a full‐length hsdS gene being expressed (TRD 1 + 2), which produces a full‐length HsdS protein encoded in a single polypeptide (red/green oval), or in a frameshift mutation downstream of the SSR tract and premature transcriptional termination, which produces a truncated HsdS polypeptide (TRD 1 only; red half‐oval). These truncated polypeptides likely dimerise via the C‐terminal coiled coil region in each truncated HsdS subunit to form a functional HsdS protein. Following oligomerization with an HsdM dimer to form an active methyltransferase, the different HsdS protein subunits result in two different methyltransferase specificities. B, Schematic representation of the location of TRD 1 and TRD 2, and the SSR tracts, in a selection of hsdS loci. Colored arrows represent different genes, with color representing homology within each gene if more than one example of this gene is present in REBASE. Hatched boxes represent the locations of each TRD. The number of different hsdS genes is noted below each species. Unique examples are listed below each individual bacterial species where this hsdS gene is present.
Figure 2
Illustration of the phase‐variable hsdS loci present in Salmonella enterica. Salmonella enterica subsp. enterica serovar India SA20085604 contains two hsdS loci annotated as containing G[n] tracts: S.Sen5604ORF3580 and S.Sen5604ORF16935. A, Alignment of these two genes was generated using Muscle and viewed using the JalView overview feature. B, The region encoding TRD 1 differs between the two hsdS genes (purple or green boxes), but the region encoding TRD 2 is identical (hatched box, yellow background). The remaining sequence shows high (>95% nucleotide) identity. Variation in the length of the G[n] tracts could potentially result in four different HsdS proteins in a population, represented by different colored ovals (full‐length purple/yellow oval or two half purple ovals from S.Sen5604ORF3580; full‐length green/yellow oval or two half green ovals from S.Sen5604ORF16935), that would combine with an HsdM protein (blue oval) to produce four different M2S methyltransferase variants.
Figure 3
A, Illustration of how phase‐variable switching of extended hsdS genes occurs. Extended hsdS genes contain three separate target recognition domains (TRDs; TRD 1 in orange at the 5′ end, a central TRD 2a in light green, and TRD 2b at the 3′ end in dark green). The SSR tract (gray boxes) is located between TRD 2a and TRD 2b. A frameshift mutation through loss or gain of repeat units in the SSR tract results in TRD 2b being out of frame with the rest of the gene, and in expression of a full‐length HsdS protein consisting of TRD 1 + TRD 2a, analogous to that in Figure 1. However, if the SSR tract length results in read‐through to TRD 2b, a protein made up of all three TRDs (TRD 1 + 2a + 2b) is expressed. Following oligomerization with an HsdM dimer to form an active methyltransferase, the different HsdS protein subunits result in two different methyltransferase specificities. B, Schematic representation of extended hsdS loci in Mannheimia haemolytica and Fusobacterium nucleatum. Colored arrows represent different genes, with color representing homology within each gene if more than one example of this gene is present in REBASE. Hatched boxes represent the locations of each TRD. The number of different hsdS genes is noted below each species. Unique examples are listed below each individual bacterial species where this hsdS gene is present.
Figure 4
hsdS genes containing AAGAC[n] tracts in Fusobacterium nucleatum. A, A phylogenetic tree was produced by aligning sequences using Muscle, with phylogeny analyzed by RAxML. Where the individual gene is annotated with an “S” prefix, this gene contains an AAGAC[n] repeat tract length at which the S1 and S2 regions are in frame and encode the extended HsdS polypeptide, and it is annotated as a full‐length hsdS gene in REBASE. Where the annotation contains an “S1 + S2” suffix, the AAGAC[n] repeat tract length means that the TRD at the 3′ end of the gene (annotated as TRD 2b here) is out of frame with the 5′ end of the gene encoding TRD 1 and TRD 2a, and the locus is annotated as two separate truncated hsdS genes (S1 made up of TRD 1 + TRD 2a, and S2 made up of TRD 2b) in REBASE. B, Alignments of the entire hsdS region present in REBASE, showing that this variation is due to the presence of multiple allelic variants of each of the three TRDs. Sequences were aligned in Muscle and viewed using the JalView overview feature.
Figure 5
A, Illustration of how phase‐variable switching of hsdM genes occurs. Variation in the length of the SSR tract located in the hsdM ORF results in biphasic ON‐OFF switching of the hsdM gene. This results either in expression of a functional HsdM protein, with ensuing methyltransferase activity dependent on the HsdS subunit present, or in no methyltransferase activity, as the hsdM gene is nonfunctional due to a frameshift and premature transcriptional termination. B, Schematic representation of hsdM loci containing SSR tracts. Colored arrows represent different genes, with color representing homology within each gene if more than one example of this gene is present in REBASE. Unique examples are listed below each individual bacterial species where this hsdM gene is present.
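
The frameshift switching illustrated in Figures 1, 3, and 5 can be made concrete with a toy example. The sketch below is an assumption-laden illustration rather than anything taken from the paper: the gene fragments UPSTREAM and DOWNSTREAM, the helper toy_gene, and the choice of a G[n] tract are all made up, and the standard codon table is used for translation. It shows only the general principle that gaining or losing a repeat unit whose length is not a multiple of three shifts the downstream reading frame and can introduce an early stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Made-up gene fragments flanking the repeat tract (hypothetical, for illustration).
UPSTREAM = "ATGGCT"             # start codon plus one codon before the tract
DOWNSTREAM = "CATAACCTGAAATAA"  # in frame 0 this reads through to a final stop


def toy_gene(n_repeats, unit="G"):
    """Build a toy ORF containing a simple sequence repeat of n_repeats units."""
    return UPSTREAM + unit * n_repeats + DOWNSTREAM


def translate(seq):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Compare the product for tract lengths that differ by a single repeat unit.
    for n in (8, 9, 10, 12):
        print(f"G[{n}] -> {translate(toy_gene(n))}")
```

In this toy gene, tract lengths of 9 and 12 keep the downstream region in frame, whereas lengths of 8 and 10 shift the frame and terminate translation early. For hsdM this corresponds to ON-OFF switching; for hsdS, as the captions above describe, the truncated product may still be functional with a different specificity.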
