Review
Microbiol Spectr. 2014 Dec;2(6):10.1128/microbiolspec.MDNA3-0029-2014.
doi: 10.1128/microbiolspec.MDNA3-0029-2014.

Diversity-generating Retroelements in Phage and Bacterial Genomes

Huatao Guo et al. Microbiol Spectr. 2014 Dec.

Abstract

Diversity-generating retroelements (DGRs) are DNA diversification machines found in diverse bacterial and bacteriophage genomes that accelerate the evolution of ligand-receptor interactions. Diversification results from a unidirectional transfer of sequence information from an invariant template repeat (TR) to a variable repeat (VR) located in a protein-encoding gene. Information transfer is coupled to site-specific mutagenesis in a process called mutagenic homing, which occurs through an RNA intermediate and is catalyzed by a unique, DGR-encoded reverse transcriptase that converts adenine residues in the TR into random nucleotides in the VR. In the prototype DGR found in the Bordetella bacteriophage BPP-1, the variable protein Mtd is responsible for phage receptor recognition. VR diversification enables progeny phage to switch tropism, accelerating their adaptation to changes in sequence or availability of host cell-surface molecules for infection. Since their discovery, hundreds of DGRs have been identified, and their functions are just beginning to be understood. VR-encoded residues of many DGR-diversified proteins are displayed in the context of a C-type lectin fold, although other scaffolds, including the immunoglobulin fold, may also be used. DGR homing is postulated to occur through a specialized target DNA-primed reverse transcription mechanism that allows repeated rounds of diversification and selection, and the ability to engineer DGRs to target heterologous genes suggests applications for bioengineering. This chapter provides a comprehensive review of our current understanding of this newly discovered family of beneficial retroelements.
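The adenine-specific step of mutagenic homing described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name `mutagenic_homing`, the `mutation_rate` parameter, and the example TR sequence are hypothetical and are not taken from the review, and the model ignores the RNA intermediate, cDNA integration, and any bias in which nucleotide replaces an adenine.

```python
import random

NUCLEOTIDES = "ACGT"

def mutagenic_homing(tr, rng, mutation_rate=1.0):
    """Return a toy VR sequence copied from the template repeat (TR).

    Every adenine in TR is replaced, with probability mutation_rate,
    by a nucleotide drawn uniformly at random (which may again be A).
    All other positions are copied unchanged, and TR itself is not modified.
    """
    vr = []
    for base in tr:
        if base == "A" and rng.random() < mutation_rate:
            vr.append(rng.choice(NUCLEOTIDES))
        else:
            vr.append(base)
    return "".join(vr)

# Hypothetical template repeat, for illustration only (not the BPP-1 TR).
tr = "GCTAACGGATTCAAC"
rng = random.Random(0)  # seeded so that repeated runs give the same variants
vr = mutagenic_homing(tr, rng)
print(tr)
print(vr)
```

Calling the function repeatedly with the same `tr` mimics repeated rounds of diversification: the template stays invariant while each call yields a new variable repeat that differs only at positions corresponding to adenines in TR.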

Figures

Figure 1
BPP-1 phage and its diversity-generating retroelement (DGR). (A) The BPP-1 genome is represented in the prophage form flanked by a duplication of the His-tRNA gene formed during integration. Functional assignments for most gene clusters are indicated, along with the cI-like repressor and the DGR cassette. (B) Schematic representation of the DGR cassette and its function in phage tropism switching. The cassette contains three genes (mtd, avd and brt) and two 134 bp repeats (template and variable repeats, or TR and VR, respectively). VR is located at the 3′ end of the mtd gene, which encodes the distal tail fiber protein responsible for receptor recognition. Located at the 3′ ends of VR and TR are IMH (Initiation of Mutagenic Homing) and IMH* elements, respectively, in addition to a GC-rich element. Phage tropism switching occurs through DGR-mediated mutagenic homing, in which TR sequence information is transferred to VR with adenine residues in TR appearing as random nucleotides in VR. Shown on the bottom are electron micrographs of the BPP-1 phage; globular structures at the distal ends of tail fibers are Mtd trimers (two per fiber). (C) Comparison of BPP-1 TR and VR. TR and VR sequences are shown in bold. VR variable positions and the corresponding adenine residues in TR are shown in red. IMH, IMH* and GC-rich elements are also indicated. There are 23 adenines in TR which can theoretically generate ~10^14 different DNA sequences, or ~10^13 different peptides. (Adapted from references & 9)
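The DNA repertoire estimate in the Figure 1 caption follows from simple counting: if each of the 23 adenines in TR can appear as any of four nucleotides in VR, the number of possible VR sequences is 4 raised to the number of adenines. The short check below is a sketch of that arithmetic; the function name `dna_repertoire` is hypothetical. It does not reproduce the peptide estimate, which depends on how the variable positions fall within codons.

```python
import math

def dna_repertoire(n_adenines):
    """Number of distinct VR DNA sequences if every TR adenine
    can be replaced by any of the four nucleotides."""
    return 4 ** n_adenines

n_adenines = 23  # adenine count in the BPP-1 TR, as stated in the caption
size = dna_repertoire(n_adenines)
print(size)                        # exact count of possible sequences
print(round(math.log10(size), 2))  # order of magnitude as a power of ten
```

The same function applies to any other DGR once the number of adenines in its TR is known.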
Figure 2
Diversification of a surface-displayed lipoprotein by a Legionella DGR. (A) The L. pneumophila strain Corby DGR is encoded on a genomic island that differs in G+C content from the rest of the genome. VR sequences at the 3′ end of the diversified locus, ldtA, are flanked by tandem hairpin/cruciform structures that are essential for efficient homing. TR contains 43 adenine residues which can create a potential repertoire of 10^26 different VR DNA sequences. (B) LdtA contains atypical TAT (twin arginine transport) and Lpp (lipobox, lipid modification) signals at the N-terminus. (C) Cellular localization studies demonstrated that LdtA is exported through the inner membrane via the TAT pathway, lipid modified, and anchored on the outer surface of the outer membrane via an Lpp-like lipoprotein processing pathway. VR-encoded residues are surface displayed by a C-terminal CLec fold. (Adapted from reference 17)
Figure 3
The CLec fold as a scaffold for display of DGR-generated protein diversity. (A) The BPP-1 Mtd protein forms a pyramid-shaped homotrimer (Left) with VR-encoded residues exposed on the bottom surface. (Right) An Mtd monomer containing β-prism, β-sandwich, and VR-encoded CLec domains, from N- to C-terminus. (B) Backbone structures of the VR regions of 5 Mtd variants with different ligand specificities are shown. Despite side chain variations in diversified VR residues, the backbone structures are nearly superimposable. (C) Comparison of the CLec VR regions of BPP-1 Mtd and a Treponema denticola variable protein, TvpA. For Mtd-VR, the β2β3 loop of a second monomer is also shown (blue). (D) Superposition of the VR regions of BPP-1 Mtd (light orange) and T. denticola TvpA (blue). (E) Interaction of an Mtd homotrimer with the receptor protein pertactin. See text for details. (Adapted from references , and 22)
Figure 4
The TPRT model of BPP-1 DGR-mediated mutagenic homing. (A) Mutagenic homing occurs through a TR-RNA intermediate and is RecA-independent, similar to group II intron homing. A marker coconversion assay mapped the cDNA transfer boundary to a narrow region within the GC-rich element at the 3′ end, which may represent 3′ cDNA integration site(s). The marker transfer boundary at the 5′ end was more heterogeneous. A target DNA-primed reverse transcription model, similar to that of group II intron homing, has been proposed to explain these observations. The DNA target site was hypothesized to be nicked within the GC-rich element, with the exposed 3′ hydroxyl group serving as a primer for adenine-specific error-prone reverse transcription of the TR RNA. Integration of cDNA products at the 5′ end requires short stretches of homology between VR and the cDNA and may occur through strand displacement or template switching followed by break repair. Subsequent DNA replication would then create progeny genomes with mutagenized variable regions. (B) Deletion of VR sequence upstream of GC and IMH elements appeared to block 5′ cDNA integration but not 3′ cDNA integration, as analyzed by PCR with primer sets 1&4 and 2&3, respectively. Sequence analysis showed adenine mutagenesis in PCR products generated with primers 2&3. (C) 5′ cDNA integration in ΔVR1-99 was restored by inserting a 50 bp mtd sequence, which is homologous to the region upstream of the deletion junction, in TR. (Adapted from reference 7)
Figure 5
Role of a DNA secondary structure in DGR target recognition. (A) A DNA hairpin/cruciform structure downstream of VR is required for BPP-1 DGR target recognition. The wild type (WT) structure contains an 8 bp GC-rich stem and a 4 nt GAAA loop and is located 4 bp downstream of VR. Mutating the 3′ half of the stem (StMut) dramatically reduced DGR mutagenic homing (B) and phage tropism switching (not shown), while complementary changes to the 5′ half of the stem (StRev) restored DGR activity in both assays. (B) PCR-based DGR homing assays with sequence-tagged TRs and VRs flanked by WT or mutant stem sequences. Shown on the right is a diagram of the PCR assay. Green represents the tag sequence transferred from TR to VR. P1-4 are primers annealing to the tag or flanking regions. (C) Similar DNA structures are found at analogous positions in a number of other phage (two shown) and bacterial (one shown) DGRs. The phage stems are GC-rich and range from 7 to 10 bp, and loops have a conserved 4 nt sequence, G(A/G)NA. The L. pneumophila Corby DGR has a more complex tandem structure that is required for homing. (D) BPP-1 DGR target recognition at the 3′ end is both sequence and structure-dependent, requiring GC, IMH and a hairpin/cruciform structure. Target recognition at the 5′ end is homology-mediated. By inserting a gene of interest (GOI) upstream of GC, IMH and the DNA structure, the heterologous gene can be diversified by the BPP-1 DGR through appropriate engineering of TR. (Adapted from reference 41)
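The structural features named in the Figure 5 caption (a GC-rich stem of 7 to 10 bp and a 4 nt loop matching G(A/G)NA) can be expressed as a simple sequence check. The sketch below is illustrative only: the function names and the example sequences are hypothetical and are not the BPP-1 sequences, and a perfect reverse-complement test is a simplification that says nothing about whether a hairpin or cruciform actually forms in vivo.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOOP_MOTIF = re.compile(r"G[AG][ACGT]A")  # the G(A/G)NA loop consensus

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def is_candidate_hairpin(seq, stem_len, loop_len=4):
    """True if seq is a perfectly paired stem of stem_len bp
    enclosing a loop that matches the G(A/G)NA consensus."""
    if len(seq) != 2 * stem_len + loop_len:
        return False
    left = seq[:stem_len]
    loop = seq[stem_len:stem_len + loop_len]
    right = seq[stem_len + loop_len:]
    paired = right == reverse_complement(left)
    return paired and LOOP_MOTIF.fullmatch(loop) is not None

# Hypothetical 8 bp GC-rich stem with a GAAA loop (made up for illustration).
wild_type = "GCCGGCGC" + "GAAA" + "GCGCCGGC"
# Altering only the 3' half of the stem breaks the pairing.
stem_mutant = "GCCGGCGC" + "GAAA" + "ATATATAT"

print(is_candidate_hairpin(wild_type, stem_len=8))
print(is_candidate_hairpin(stem_mutant, stem_len=8))
```

Changing the 5' half to restore complementarity with a mutated 3' half would make the check pass again, which loosely parallels the logic of the StMut and StRev constructs described in the caption.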
Figure 6
Avd and Brt. (A) (Left) The BPP-1 Avd protein forms a homopentameric structure, with each monomer containing 4 helices running up and down (side view). The pentamer is highly positively charged (top view; blue, positively charged; red, negatively charged). (Right) Amino acid residues on the side, top, and bottom of the Avd pentamer that were tested for Avd-Brt binding and/or DGR homing (27). (B) DGR RT domains and the sequence logo of its highly conserved domain R4. R1-R7 are conserved sequence blocks found in the finger and palm domains of retroviral RTs, such as HIV-1 RT (bottom). DGR RTs contain sequence insertions between R2 and R3 (R2a), and between R3 and R4 (R3a), as well as divergent N- and C-termini. They do not contain the thumb (Th) and RNase H (RH) domains that are found in HIV-1 RT. The domain R4 sequence logo of 155 DGR RTs was generated by Schillinger et al. using WebLogo (13, 49, 50). Comparison with the domain R4 sequence logo that we generated from 93 bacterial group II intron RTs [group II intron database: http://webapps2.ucalgary.ca/~groupii/orf/orfalignment.html; (–53)], which are most closely related to DGR RTs, showed several characteristic differences, including the two highly conserved positions labeled with *. Also included for comparison is the corresponding amino acid sequence block of HIV-1 RT (Strain BRU; accession # K02013). The glutamine residue at position 151, which plays a role in nucleotide and template preference during reverse transcription, is highlighted in blue. (Adapted from reference 13)

References

    1. Vink C, Rudenko G, Seifert HS. Microbial antigenic variation mediated by homologous DNA recombination. FEMS Microbiol Rev. 2012;36:917–948.
    2. Agrawal A, Eastman QM, Schatz DG. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature. 1998;394:744–751.
    3. Kapitonov VV, Jurka J. RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons. PLoS Biol. 2005;3:e181.
    4. Hencken CG, Li X, Craig NL. Functional characterization of an active Rag-like transposase. Nat Struct Mol Biol. 2012;19:834–836.
    5. Liu M, Deora R, Doulatov SR, Gingery M, Eiserling FA, Preston A, Maskell DJ, Simons RW, Cotter PA, Parkhill J, Miller JF. Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage. Science. 2002;295:2091–2094.