Nucleic Acids Res. 2000 Sep 15;28(18):3417-32.
doi: 10.1093/nar/28.18.3417.

SURVEY AND SUMMARY: Holliday junction resolvases and related nucleases: identification of new families, phyletic distribution and evolutionary trajectories

L Aravind et al. Nucleic Acids Res. 2000.

Abstract

Holliday junction resolvases (HJRs) are key enzymes of DNA recombination. A detailed computer analysis of the structural and evolutionary relationships of HJRs and related nucleases suggests that the HJR function has evolved independently from at least four distinct structural folds, namely RNase H, endonuclease, endonuclease VII-colicin E and RusA. The endonuclease fold, whose structural prototypes are the phage lambda exonuclease, the very short patch repair nuclease (Vsr) and type II restriction enzymes, is shown to encompass a far greater diversity of nucleases than previously suspected. This fold unifies archaeal HJRs, repair nucleases such as RecB and Vsr, restriction enzymes and a variety of predicted nucleases whose specific activities remain to be determined. Within the RNase H fold a new family of predicted HJRs, which is nearly ubiquitous in bacteria, was discovered, in addition to the previously characterized RuvC family. The proteins of this family, typified by Escherichia coli YqgF, are likely to function as an alternative to RuvC in most bacteria, but could be the principal HJRs in low-GC Gram-positive bacteria and Aquifex. Endonuclease VII of phage T4 is shown to serve as a structural template for many nucleases, including McrA and other type II restriction enzymes. Together with colicin E7, endonuclease VII defines a distinct metal-dependent nuclease fold. As a result of this analysis, the principal HJRs are now known or confidently predicted for all bacteria and archaea whose genomes have been completely sequenced, with many species encoding multiple potential HJRs. Horizontal gene transfer, lineage-specific gene loss and gene family expansion, and non-orthologous gene displacement seem to have been major forces in the evolution of HJRs and related nucleases.
A remarkable case of displacement is seen in the Lyme disease spirochete Borrelia burgdorferi, which does not possess any of the typical HJRs, but instead encodes, in its chromosome and each of the linear plasmids, members of the lambda exonuclease family predicted to function as HJRs. The diversity of HJRs and related nucleases in bacteria and archaea contrasts with their near absence in eukaryotes. The few detected eukaryotic representatives of the endonuclease fold and the RNase H fold have probably been acquired from bacteria via horizontal gene transfer. The identity of the principal HJR(s) involved in recombination in eukaryotes remains uncertain; this function could be performed by topoisomerase IB or by a novel, so far undetected, class of enzymes. Likely HJRs and related nucleases were identified in the genomes of numerous bacterial and eukaryotic DNA viruses. Gene flow between viral and cellular genomes has probably played a major role in the evolution of this class of enzymes. This analysis resulted in the prediction of numerous previously unnoticed nucleases, some of which are likely to be new restriction enzymes.

Figures

Figure 1
Multiple alignments of the HJRs of the RNase H fold. (A) RuvC family. (B) YqgF family. Each protein is labeled using the gene name followed by the species abbreviation and the GenBank gene identifier. The extent of the domain in each protein is indicated by numbers to the sides of the alignments. The long poorly conserved inserts are replaced by numbers indicating the number of omitted residues. Only Motifs I and IV are aligned between the RuvC and YqgF families. The secondary structure predicted using the PHD program is shown above the alignment with H/h for α-helix and E/e for β-strand (upper case denoting strong prediction and lower case moderate prediction). The shading and coloring are according to the 90% consensus, which is shown underneath the alignment, with the following convention: h, hydrophobic residues (YFWLIVMA); l, aliphatic residues (LIVMA); a, aromatic residues (YFWH), yellow background; p, polar residues (STQNEDRKH), red foreground; s, small residues (SAGTVPNHD), turquoise background; u, tiny residues (GAS), light green background; c, charged residues (KHRED), magenta foreground; b, bulky residues (LIYWFEQRKM), gray background. The residues predicted to form the active site or associated with catalysis are shown in inverse coloring. The species abbreviations are as indicated in Table 1. The following are abbreviations not shown in Table 1: Wolsp, Wolbachia sp.; MSEPV, Melanoplus sanguinipes entomopoxvirus; MCV, Molluscum contagiosum virus; VacV, vaccinia virus; biL66 and LBPc2, lactococcal phages biL66 and c2.
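The consensus convention in this legend can be illustrated with a minimal Python sketch. It is not the authors' procedure: the residue classes are copied from the legend, but the function names, the treatment of gaps (counted as non-matching), the use of "." for columns with no consensus, and the rule of testing narrower classes before broader ones are all assumptions made for illustration.

```python
# Residue classes exactly as listed in the Figure 1 legend.
RESIDUE_CLASSES = {
    "h": "YFWLIVMA",    # hydrophobic
    "l": "LIVMA",       # aliphatic
    "a": "YFWH",        # aromatic
    "p": "STQNEDRKH",   # polar
    "s": "SAGTVPNHD",   # small
    "u": "GAS",         # tiny
    "c": "KHRED",       # charged
    "b": "LIYWFEQRKM",  # bulky
}

# Assumption (not stated in the legend): narrower classes are tested first,
# so the most specific label that satisfies the threshold is reported.
CLASS_ORDER = sorted(RESIDUE_CLASSES, key=lambda k: len(RESIDUE_CLASSES[k]))


def column_consensus(column, threshold=0.9):
    """Return the consensus symbol for one alignment column.

    An upper-case letter means a single residue reaches the threshold,
    a lower-case letter means a residue class does, "." means neither.
    Gap characters ("-") never match, which is an assumption.
    """
    column = column.upper()
    total = len(column)
    if total == 0:
        return "."
    # First look for a single conserved residue.
    for residue in set(column) - {"-"}:
        if column.count(residue) / total >= threshold:
            return residue
    # Otherwise look for the narrowest class covering the column.
    for label in CLASS_ORDER:
        members = RESIDUE_CLASSES[label]
        covered = sum(1 for r in column if r in members)
        if covered / total >= threshold:
            return label
    return "."


def alignment_consensus(rows, threshold=0.9):
    """Build a consensus line for equal-length aligned sequences."""
    columns = zip(*rows)
    return "".join(column_consensus("".join(col), threshold) for col in columns)


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = ["DGLIKW", "DALVRD", "DSIIKG", "DGMLEK"]
    print(alignment_consensus(toy))  # prints "Dullc."
```

With only four toy sequences, a 90% threshold requires every row to match; real alignments with many sequences tolerate a few exceptions per column.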
Figure 2
Topological diagrams of the HJRs of the RNase H fold. The α-helices are shown by green cylinders and the β-strands by violet arrows. The strands are numbered in the order of occurrence. Note the distinct positions of conserved glutamates in the RuvC and YqgF families.
Figure 3
Multiple alignments of the HJRs and related nucleases of the endonuclease fold. (A) A schematic of the conserved motifs containing the (predicted) catalytic residues. (B) Superfamily I: the AHJR–Mrr family. (C) Superfamily I: the RecB family. (D) Superfamily I: the PHAC family. (E) Superfamily I: the λ exonuclease family. (F) Superfamily II: Vsr homologs. The schematic representation in (A) shows the configuration of the three conserved motifs of superfamilies I and II as well as certain family-specific motifs described in the text. The conserved residues that are present in >25% of the cases are shown by the single letter code in upper case. In other cases the general consensus category for the residues as indicated in the legend to Figure 1 is shown in lower case. The alignment notation is as indicated in the legend to Figure 1. The conserved motifs of each superfamily are indicated above the alignment. All families of superfamily I share three conserved motifs as shown in (A), but because of the absence of extended sequence similarity between the families, the alignment for each family is shown separately (B–F). In (E) only two sequences from B.burgdorferi, BB036 from the chromosome and BBC12 from linear plasmid C, are shown; the remaining plasmid-encoded sequences are nearly identical to these. Blue letters in some of the sequences in the alignment indicate anomalous inserts that have been excised. Additional species abbreviations: Sty, Salmonella typhimurium; Rsph, Rhodopseudomonas sphaeroides; Hs, Homo sapiens; Ll, Lactococcus lactis; Ban, Bacillus anthracis; Tfo, Thiobacillus ferrooxidans; Vic, Vibrio cholerae; Coxb, Coxiella burnetii; Rhi, Rhizobium sp.; LPA118, Listeria phage A118; NPVAC, AcMNPV and NPVOP, nuclear polyhedrosis viruses of Autographa californica, Bombyx mori and Orgyia pseudotsugata; EBV, Epstein–Barr virus; KSV, Kaposi sarcoma virus; HSVSA, herpes virus saimiri; HSVEB, equine herpes virus B; VZVD, varicella zoster virus D.
Figure 4
Topological diagrams of the enzymes of the endonuclease fold. (A) λ exonuclease, the structural template for superfamily I. (B) Vsr, the structural template for superfamily II. The positions of conserved motifs indicated in Figure 3 are shown by arrows and the residues involved in catalysis are indicated; the coordinated metal cations are shown by yellow circles.
Figure 5
Domain architectures of HJRs and related nucleases. Nuclease domains are indicated by the following abbreviations: McrA, EndoVII fold nuclease domain; RecB, RecB family nuclease of the endonuclease fold; PHAC, PHAC family nuclease of the endonuclease fold; AHJR, nuclease of the archaeal HJR family of the endonuclease fold; Vsr_nuc, nuclease of the Vsr–YcjD superfamily of the endonuclease fold. The proteins are labeled as in the alignment figures. For the helicase domains of superfamilies I and II (SFI and SFII) the closest functionally characterized homologs are indicated. Domain abbreviations: URI, UvrC-Intron nuclease domain; MutS, mismatch repair ATPase; C4, four-cysteine Zn cluster; C3, three-cysteine Zn cluster; TOP C4, Zn ribbon domain related to that at the C-termini of topoisomerase IA; PriA, predicted Zn-binding domain shared with the PriA helicases; HTH, helix–turn–helix DNA-binding domain; DGQQR, uncharacterized conserved domain found in a diverse set of bacterial and archaeal proteins and designated by its characteristic amino acid signature (L.Aravind and E.V.Koonin, unpublished observations); PDA, PHAC/DGQQR-associated domain (L.Aravind, unpublished observations); SAD, uncharacterized conserved domain associated with the SET domain in several chromatin-associated proteins. A coiled-coil domain inserted into the helicase domain of MTH487 is indicated by a brown bar and signal peptides and transmembrane regions are indicated by yellow bars.
Figure 6
Multiple alignment of the Holliday junction resolvases and related nucleases of the EndoVII–colicin E fold. The alignment notation is as indicated in the legend to Figure 1. Additional species abbreviations: Sgl, Sarcophyton glaucum; Kpn, Klebsiella pneumoniae; Phi31, Lactococcus phage φ31; PHIC31, Streptomyces phage φC31; Phi105, Bacillus subtilis phage φ105; Sphae, Streptomyces phaeochromogenes; Scp, Saccharopolyspora sp.; Mb, Moraxella bovis.
Figure 7
Structures of nucleases of the EndoVII–colicin E fold. (A) Colicin E. (B) EndoVII. The residues involved in chelating the active metal and the stabilizing zinc cluster in EndoVII are shown in the ball-and-stick representation. The orientations of the N-terminal helices differ in the two families. Note the highly conserved histidine shared by these families that faces away from the chelated active metal and is required for catalysis.
