Review
EcoSal Plus. 2025 Dec 16;13(1):eesp00142022.
doi: 10.1128/ecosalplus.esp-0014-2022. Epub 2025 Aug 26.

Biology of host-dependent restriction-modification in prokaryotes


Brian P Anton et al. EcoSal Plus.

Abstract

Understanding the mechanisms that modulate horizontal genetic exchange in prokaryotes is a key problem in biology. DNA entry is limited by resident host-dependent restriction-modification (HDRM) systems, which are present in most prokaryotic genomes. This review focuses on the biological functions of HDRM rather than on detailed enzyme mechanisms. DNA in each cell carries epigenetic marks imposed by host-dependent modification enzymes (HDM), most often base methylation but also additions to the phosphodiester backbone. The pattern of base and backbone modifications is read by host-dependent restriction enzymes (HDR). Broadly, HDRM systems read the pattern of chemical modifications to DNA at host-determined (HD) sites to regulate the fate of incoming mobile DNA. DNA carrying an inappropriate pattern may be restricted because of either the absence of a protective modification or the presence of a sensitizing one; the latter activity is mediated by modification-dependent restriction enzymes (MDRE). Most often, restriction occurs via nuclease-mediated degradation, but it can also act via other mechanisms that prevent the initiation of replication. Like other genome-defense systems, HDRM systems are highly diverse and somewhat modular. The basic functions required for action in vivo, and the protein domains responsible for each function, are addressed here. Particularly under-studied among these domains are the interaction domains that control the launch of highly toxic activities such as HDR. These domains have been evolutionarily shuffled to build a variety of classical RM systems as well as more divergent systems.

Keywords: DNA methylation; DNA phosphorothioation; anti-phage; bacteriophages; defense islands; genome defense; host-dependent restriction-modification; mobile DNA; restriction endonuclease.

Conflict of interest statement

Brian P. Anton, James Eaglesham, Richard J. Roberts, Shuang-yong Xu, Peter R. Weigele, and Elisabeth A. Raleigh are/were employees of New England Biolabs, Inc.

Figures

Fig 1
Host-dependent DNA modifications of prokaryotes. Host-dependent nucleobase and backbone modifications are used by cells to distinguish self from non-self. The top row illustrates the canonical nucleobases modified by HDRM systems. The second row illustrates their modified derivatives, with the added chemical groups indicated in red. The bottom row shows a canonical phosphodiester backbone (left) and its Rp-phosphorothioate derivative (right), in which sulfur replaces a non-bridging oxygen.
Fig 2
Pattern reading: inherited DNA modification patterns direct HDRM action. (A) Endogenous chromosomal DNA is modified by an MTase (HDM) and thus protected from the partner REase (HDR), whereas unmodified phage DNA infecting a new host is restricted. (B) Modification that protects from one REase may cause sensitivity to another. M.AluI methylates the cytosine of AGCT to protect against R.AluI. Type IV MDRE McrBC cleaves such modified DNA when two RmC are appropriately spaced.
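The pattern-reading logic of panel B can be sketched in Python as a toy model. This is not the enzymes' actual mechanism. The function names, the top-strand-only representation of methylation as a set of cytosine positions, and the default spacing window (min_gap, max_gap) are all assumptions introduced here for illustration; the caption states only that the two RmC elements must be appropriately spaced.

```python
def find_sites(seq, motif):
    """Return 0-based start positions of every occurrence of motif in seq."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]

def m_alui(seq):
    """Toy M.AluI: positions of the cytosine in every AGCT site (top strand only)."""
    return {i + 2 for i in find_sites(seq, "AGCT")}

def alui_restricts(seq, methylated):
    """Toy R.AluI: restricts if any AGCT site carries an unmethylated cytosine."""
    return any(i + 2 not in methylated for i in find_sites(seq, "AGCT"))

def mcrbc_restricts(seq, methylated, min_gap=40, max_gap=3000):
    """Toy McrBC: restricts if two RmC elements (R = A or G) fall within the
    spacing window. The window defaults are illustrative placeholders."""
    rmc = sorted(p for p in methylated
                 if p > 0 and seq[p] == "C" and seq[p - 1] in "AG")
    return any(min_gap <= b - a <= max_gap
               for k, a in enumerate(rmc) for b in rmc[k + 1:])

# Hypothetical incoming DNA with two AGCT sites 54 bp apart.
dna = "AGCT" + "T" * 50 + "AGCT"
unmodified = set()
modified = m_alui(dna)

print(alui_restricts(dna, unmodified), mcrbc_restricts(dna, unmodified))
print(alui_restricts(dna, modified), mcrbc_restricts(dna, modified))
```

In this sketch the same methylation pattern that shields the DNA from the toy R.AluI exposes it to the toy McrBC, which is the point of panel B.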
Fig 3
Functions that enable RM and composition of system types. Colored boxes represent protein domains with distinct functions (box on the lower right): site recognition specificity, modification catalysis, DNA cleavage, regulatory protein-protein interaction, and NTPase to facilitate scanning DNA for sites. Functions may be fulfilled in multiple ways. Distinct solutions may be recognized bioinformatically, as by Pfam signatures (Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models; non-exhaustive examples in parentheses). Four families in which HDR activity acts on unmodified DNA are shown on the left, whereas another four families shown on the right attack when a sensitizing modification is present: collectively, these are Type IV systems (MDRE). See text for elaboration.
Fig 4
Regulatory mechanisms of Type IIP HDRM. (A) Regulation by promoter methylation, where the coordinated expression of an RM system relies on the methylation status of the promoter of the MTase gene (in the case of CfrBI localized within inverted repeats), amid two divergent overlapping promoters for the REase (PR) and MTase (PM). In this system, MTase gene transcription is repressed as the accumulation of methylation in its promoter decreases RNA polymerase (RNAP) binding, allowing more transcription from the competing REase promoter. The putative cruciform structure formation may limit REase expression by terminating transcription from the REase promoter. (B) Operator binding by Regulatory MTase. Regulatory MTases are translationally fused to a sequence-specific DNA-binding protein with a helix-turn-helix (HTH) domain at its N-terminus. A binding site for the HTH domain is present between the overlapping REase and MTase promoters. The MTase HTH competes with RNAP for PM binding, leading to MTase auto-repression and increased availability of PR to RNAP. (C) Regulatory antisense RNAs are driven from two strong reverse promoters, located within the REase gene. These act by inhibiting both PR and PM, via negative feedback loops. In addition, resulting mRNA/aRNA duplexes are susceptible to RNase degradation, leading to decreased mRNA levels. (D) Regulation by dedicated transcription factors. C (control) proteins activate PCR transcription at low C concentrations and repress it at higher levels, via autogenous feedback loops. C proteins bind and distort the DNA operator sequence within their own promoter via HTH motifs, forming homodimers (activators) or tetramers (repressors). Abbreviations: M, modification enzyme gene; R, restriction endonuclease gene; C, control protein; aRNA, antisense RNA; PR, promoter of REase; PM, promoter of MTase; PCR, shared promoter of C gene and REase; SD, Shine-Dalgarno sequence.
Fig 5
Protection of asymmetric restriction targets during replication. (A) A replication fork moving from left to right lays down a new strand (orange arrows) instructed by the parental strand (blue arrows). Recognition sites for RM action on the parental strand are already modified (red stars), but sites on the daughter strand are not yet modified (green open stars). (B) M.EcoP15I methylates (5’ CAGCAG 3’) on only one strand; restriction requires two unmodified sites on opposite strands. When two successive sites are head to tail, existing modification protects each of the replicated endogenous copies from nuclease action, while invading unmodified DNA is sensitive.
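The strand logic described in panel B can be illustrated with a small Python sketch. It is a deliberately simplified model under stated assumptions: the function names and the modified_strands argument are hypothetical, site spacing and head-to-head versus tail-to-tail geometry are ignored, and the only rule encoded is the one in the caption, that restriction needs unmodified sites on opposite strands.

```python
SITE = "CAGCAG"      # EcoP15I site as given in the caption (read on the top strand)
SITE_RC = "CTGCTG"   # its reverse complement: the site sits on the bottom strand

def oriented_sites(top):
    """List (position, strand) for each site; '+' = top strand, '-' = bottom."""
    hits = []
    for i in range(len(top) - len(SITE) + 1):
        window = top[i:i + len(SITE)]
        if window == SITE:
            hits.append((i, "+"))
        elif window == SITE_RC:
            hits.append((i, "-"))
    return hits

def is_restricted(top, modified_strands):
    """Toy rule: restricted only if unmodified sites exist on both strands.
    modified_strands is the set of strands ('+', '-') whose sites are methylated."""
    unmodified = {strand for _, strand in oriented_sites(top)
                  if strand not in modified_strands}
    return unmodified == {"+", "-"}

# Hypothetical DNA carrying one site on each strand.
dna = "AACAGCAGTTTTCTGCTGAA"

print(is_restricted(dna, set()))        # invading, fully unmodified DNA
print(is_restricted(dna, {"+"}))        # daughter duplex keeping the parental top strand
print(is_restricted(dna, {"-"}))        # daughter duplex keeping the parental bottom strand
print(is_restricted(dna, {"+", "-"}))   # fully modified host DNA
```

Because each daughter duplex inherits one fully modified parental strand, the unmodified sites it carries all lie on the newly made strand, so the two-strand requirement is not met in this model.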
Fig 6
Type I HDRM system protein components, DNA sites, and active assemblies. Top row: protein assemblies competent for different actions; bottom row: individual components that contribute to active assemblies. Components were abstracted from structural visualizations in Fig. 1 and 2 of (146). The complexes there were modeled from crystal structures of EcoKI HsdM (M1, M2), HsdS (S) and DNA, and EcoR124 HsdR (R1, R2). Overall organization and enzyme mechanism are similar. Scale of the M2:S:DNA complex is approximate as presented for M.EcoKI in (146). Type I S proteins exhibit high variability in the Target Recognition Domains (TRD; green and orange blobs), which recognize distinct DNA sequences (green and orange highlights on orange helix). TRDs are connected by conserved helical regions (green and orange bars). Sites comprise 7-8 specific nucleotides (green and orange dots on the DNA structural visualization), with nonspecific spacer. Protection is conferred by methylation of adenine (bold green and orange in the sequence). Catalytic components of HsdR include the nuclease domain (pink region) and two translocase domains (T1 and T2; violet region). Relative scale as between the two figures of (146) has been roughly adjusted to fit the figure here, recalling that HsdR of EcoR124I is 1033 aa, whereas HsdM of EcoKI is 529 aa. Above: Active assemblies. M2S1 is an active methyltransferase while R2M2S1 can either modify a hemi-methylated site or engage DNA to translocate and cleave. For both EcoKI and EcoR124I, an unmodified recognition site triggers translocation (not shown) of flanking DNA by HsdR/T, with R2M2S remaining fixed at the unmodified site. EcoR124I translocates both as R2M2S1 and R1M2S1. R.EcoKI has a required, protease-sensitive N-terminal extension not found on R.EcoR124I. DNA binding by S proteins requires M2. Not shown are cofactors S-adenosylmethionine (SAM), Mg++ (both required for all activities), and ATP (required for translocation and cleavage).
Fig 7
Structure and topology of PUA and PUA-like domains. Top row: cartoon domains rendered in rainbow colors (N red to C violet); bottom row: associated domain topology renderings. Left to right: 5mC-restricting TagI endonuclease residues 1-164, PDB: 6GHS, with SRA domain topology; VcaM4I endonuclease, which restricts 5hmC and 5mC (rainbow, residues 1-147), PDB: 6YEX, with EVE domain topology; Yth domain from Tg∆185, a naturally-occurring Yth-McrB fusion that binds to 6mA modified DNA, PDB: 6P0F, with Yth domain topology; E. coli protein YqfB, an N4-acetylcytidine amidohydrolase that hydrolyzes various N4-acylated cytosines (4acC) and cytidines, PDB: 1TE7, with ASCH topology; Archaeosine tRNA-guanine transglycosylase (Arc-TGT), PDB: 1J2B. The PUA domain consists of 67–94 amino acid residues with 4–7 β-strands architecture and a few α-helix insertions. In this example, the PUA fold is composed of six β-strands coiled to form a pseudobarrel, also known as a folded β-sandwich with PUA domain topology.
Fig 8
BREX gene clusters discussed. Block arrows represent genes that are present in BREX clusters, labeled with the gene name (A = brxA, B = brxB). Numbers identify strains listed at lower right. Each row corresponds to a set of genes found in one or more characterized BREX clusters; all five examples carry homologs in blue. Two plasmid-borne examples carry brxR (green). Extraneous genes not involved in BREX activity express a Type IV MDRE enzyme (red) or two genes specifying another antiphage activity (orange). Functional roles are unclear except for BrxR, shown to repress transcription; and PglX, with methyltransferase catalytic signatures. The five dark blue genes are required for both modification and restriction where tested; light blue genes (brxL) or segments (brxC C-terminus) are required only for restriction where tested.
Fig 9
Biological phosphorothioate formation. (A) DNA sites modified by Dnd and Ssp systems. Dnd systems replace nonbridging oxygen in the phosphate moiety (p) with sulfur (s) preferentially in GA or GT dinucleotides. When both strands are modified, the consensus 5'GAAC/5'GTTC becomes 5'GpsAApC/5'GpsTTpC. The resulting linkage is shown at the bottom right of panel B. Ssp systems modify only one strand (5' CpsC 3'), using a reader complex distinct from the double-strand capable Dnd systems. (B) Flow of sulfur (S) as a series of disulfide reduction steps begins with cysteine and finally transfers to DNA. Transfer is mediated by related or analogous activities at each step. The final sulfur donor (DndC or SspD) collaborates with alternative DNA nicking-ATPase components to provide energy and possibly substrate stabilization.
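The site notation in panel A can be made concrete with a short Python sketch that writes "ps" at each candidate phosphorothioate linkage on a single 5'-to-3' strand. The function names are hypothetical, and the sketch marks every match to the motifs named in the caption; it makes no claim about which candidate sites a real Dnd or Ssp system actually modifies.

```python
import re

def mark_dnd_sites(strand):
    """Write 'ps' after the G of every GAAC or GTTC on one 5'->3' strand."""
    return re.sub(r"G(?=AAC|TTC)", "Gps", strand)

def mark_ssp_sites(strand):
    """Write 'ps' between the two cytosines of every CC on one 5'->3' strand."""
    return re.sub(r"C(?=C)", "Cps", strand)

print(mark_dnd_sites("TTGAACTT"))
print(mark_dnd_sites("AAGTTCAA"))
print(mark_ssp_sites("ACCT"))
```

For a duplex, applying mark_dnd_sites to each strand separately (both read 5' to 3') gives the double-stranded pattern described in the caption, since GTTC is the reverse complement of GAAC.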
Fig 10
The enteric immigration control region (ICR) exemplifies a defense island. Gene diagrams of the ICR from select enterobacterial strains. The gene content between the conserved boundary genes yjiA and yjiN is highly variable across the Enterobacteriaceae. Most loci include genome defense functions, including three families of Type I restriction endonucleases (hsdRMS) and Type IV modification-dependent restriction endonucleases (e.g., mrr, mcrBC, and gmrSD homologs). Functions encoded in the ICR that are apparently unrelated to genome defense, such as hydrolytic enzymes (penicillinase), metabolite reductases (aldehyde dehydrogenase), and small molecule transporters (mtdM), may have additional fitness effects.
Fig 11
Hotspot (HS)37 extended to Enterobacteriaceae. Gene diagrams of hotspot #37 from select enterobacterial strains. This hotspot is anchored by the leuX tRNA, an attachment site for a site-specific integrase. The spectrum of defense and metabolic activities found here is like that of the ICR, but family memberships are distinct.
Fig 12
Phage antirestriction activities and host countermeasures. Bacteriophages use a broad range of strategies to interfere with host restriction-modification systems (shown in boxes, left). On entry, these include phage DNA co-injection with protein inhibitors of RM, early expression of genes encoding proteins that stimulate the acquisition of host modification patterns, and prior chemical modification of phage DNA to shroud it from RM recognition. During development, a variety of encoded proteins block restriction through diverse means. In response (on the right), bacteria may encode additional immune systems that directly target antirestriction countermeasures. RM guardian systems such as PrrC and PARIS detect RM inhibition and activate abortive infection responses that drive cell death or growth arrest. Type IV MDREs (modification-dependent restriction endonucleases) directly target phage DNA modifications and hypermodifications.
