Targeting IS608 transposon integration to highly specific sequences by structure-based transposon engineering

Affiliations

¹ Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany.
² Laboratoire de Microbiologie et Génétique Moléculaires, Centre National de la Recherche Scientifique, Toulouse Cedex 31062, France.
³ Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA.

PMID: 29635476
PMCID: PMC5934647
DOI: 10.1093/nar/gky235

Natalia Rosalía Morero et al. Nucleic Acids Res. 2018 May 4;46(8):4152-4163.

Abstract

Transposable elements are efficient DNA carriers and thus important tools for transgenesis and insertional mutagenesis. However, their poor target sequence specificity constitutes an important limitation for site-directed applications. The insertion sequence IS608 from Helicobacter pylori recognizes a specific tetranucleotide sequence by base pairing, and its target choice can be re-programmed by changes in the transposon DNA. Here, we present the crystal structure of the IS608 target capture complex in an active conformation, providing a complete picture of the molecular interactions between transposon and target DNA prior to integration. Based on this, we engineered IS608 variants to direct their integration specifically to various 12/17-nt long target sites by extending the base pair interaction network between the transposon and the target DNA. We demonstrate in vitro that the engineered transposons efficiently select their intended target sites. Our data further elucidate how the distinct secondary structure of the single-stranded transposon intermediate prevents extended target specificity in the wild-type transposon, allowing it to move between diverse genomic sites. Our strategy enables efficient targeting of unique DNA sequences with high specificity in an easily programmable manner, opening possibilities for the use of the IS608 system for site-specific gene insertions.
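
To put the gain in specificity into numbers, the short sketch below estimates how often a fixed site of a given length is expected to occur by chance. It assumes a random sequence with equal base frequencies, which real genomes only approximate, and the function name `expected_spacing` is a hypothetical helper, not part of the published work. The lengths 4, 12 and 17 nt correspond to the native tetranucleotide and the extended sites described in the abstract.

```python
def expected_spacing(site_length: int, alphabet_size: int = 4) -> int:
    """Average number of positions between chance occurrences of one fixed site.

    Assumption: independent positions with uniform base composition, so a site
    of length n matches at any given position with probability 1 / 4**n.
    """
    return alphabet_size ** site_length


if __name__ == "__main__":
    # Native 4-nt site versus the engineered 12-nt and 17-nt sites.
    for n in (4, 12, 17):
        print(f"{n:>2}-nt site: about one chance match per {expected_spacing(n):,} nt")
```

Under this simplified model, a tetranucleotide recurs about every 256 nt, whereas a 12-nt site is expected roughly once per 17 million nt and a 17-nt site roughly once per 17 billion nt.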

Figures

**Figure 1.**
The IS608 structure and transposition mechanism (adapted from (16)). (A) The IS608 left (LE, red) and right (RE, blue) ends flank the *tnpA* and *tnpB* open reading frames (block arrows). Subterminal imperfect palindromes (IP_R and IP_L), right and left cleavage sites (C_R: TCAA and C_L: TTAC) and right and left guide sequences (G_R: GAAT and G_L: AAAG) are highlighted. Black wedges mark the positions of cleavage and 5′ phosphotyrosine TnpA-DNA intermediate formation. (B) Model of the IS608 transposition pathway. After transposon end cleavage (i), the donor DNA sequence is precisely sealed and a circular transposon junction is formed (ii) as an intermediate before cleavage and re-integration into a new target site (iii and iv). Black wedges mark the positions of cleavage at the transposon ends (i) and 3′ to a target cleavage site (C_T) (iii). Specific base-pairing between guide and cleavage sequences in the transposon before excision (i), and between G_L and C_T before re-integration (iii), are indicated with dotted lines.

**Figure 2.**
The IS608 target capture complex structure. (A) Overall view of the IS608 TnpA/LE29/T6’ structure. One of two synaptic complexes in the crystal asymmetric unit is shown. A TnpA dimer (cartoon representation, chain A in light blue and chain B in blue) is bound to two LE29 hairpin DNA molecules (red, G_L in orange) and two T6’ target oligos including positions -5 to +1 (grey, with the C⁺¹ nucleotide highlighted in black). Catalytic residues are shown in sticks representation with atomic colouring. Ca²⁺ ions are shown as green spheres. (B) The architecture of LE29 and its specific base contacts with T6’. Blue dotted arrows indicate non-canonical base interactions between A⁺⁴² and T⁺⁴³ 3′ of IP_L with A⁺¹⁷ and A⁺¹⁶ from G_L, respectively, which create base triplets together with T⁻⁴ and T⁻³ from T6’. (C) Two base triplets between LE29 and T6’ (bases in sticks representation with atomic colouring), with hydrogen bonds shown as dotted lines. (D) Superposition of the four TnpA/LE29/T6’ complexes (i to iv) present in the crystallographic asymmetric unit, highlighting the different orientations of C⁺¹ (black sticks). (E) Coordination of the metal ion cofactor in the active site and the position of C⁺¹ in the active pre-cleavage conformation (complex i). Catalytic residues as well as amino acids and DNA forming the C⁺¹ binding site are shown as sticks with atomic colouring.

**Figure 3.**
Sequence hallmarks affecting IS608 target selection. (A) Scheme of the IS608 left end (LE) and target oligos (Ti) used to monitor TnpA-mediated cleavage with variable sequences upstream and downstream of the core TTAC target sequence (C_T). The arrow indicates the position of target cleavage. (B) Cleavage assays monitoring covalent TnpA-DNA complex formation on SDS-PAGE gels. Upon Ti cleavage, TnpA becomes covalently attached to the variable 16 nt sequence downstream of the cleavage position and can be resolved from unmodified TnpA. Targets were classified into sets with good (SET-1) and poor activity (SET-2), as shown below the gel. Cleavage reactions are shown for representative SET-1 and SET-2 targets (lanes 2–4). The negative control (lane 1) does not contain target DNA. Cleavage reactions for derivatives of target 2.2, with the sequence upstream (u), downstream (d) of TTAC or both (ud) replaced by the corresponding sequence from target 1.1 (see sequences below SET-2) are shown in lanes 5–7. (C) Mutation of the nucleotide C in position +1 compromises cleavage in representative SET-1 targets (lanes 2–5), whereas introduction of a C at this position rescues activity of weak SET-2 targets (lanes 6–11). Covalent complex formation is monitored on SDS-PAGE and target sequences are shown below.

**Figure 4.**
IS608 can be specifically targeted to longer integration sites by extended LE/target base pairing. (A) Close-up of the TnpA/LE29/T6’ structure, highlighting the proximity between the 3′ end of IP_L and the 5′ end of the target oligonucleotide. The distance between the O5′ oxygen atom of A⁻⁵ in T6’ and the phosphorus atom (P) of A⁺⁴⁴ in LE29 is 10.5 Å (dashed line). (B) Design of the IS608 transposon junction (Ji, where ‘i’ is a variable indicating a specific variant number) and complementary target substrates (Tic, with ‘i’ marking a specific variant as above) used for retargeting. Each set of Ji/Tic oligos was designed to include an 8 bp complementary region between the 3′ extension of the IP_L and the sequence upstream of the native TTAC target site (light blue shade). The 8 bp complementary sequence displayed here corresponds to the J1/T1c pair. ³²P radioisotope labeling is indicated by an asterisk. Upon target cleavage and integration (at the arrow), the radiolabeled 5′ segment of the junction upstream of the cleavage site (50 nt) is attached to the 3′ segment of the target (38 nt). (C) Sequencing DNA PAGE gel monitoring J1 cleavage and integration into its T1c complementary target. A random target substrate containing a TTAC site but no additional complementarity to the junction (marked as Tr) was used in a competition reaction with T1c (in 1:1 molar ratio) to monitor integration specificity (lane 5). Tr contains a shorter (30 nt) 3′ segment following the cleavage site than Tic, so that the integration products can be clearly distinguished. Schematics for the labeled junction substrate (a), the cleavage product (d) and integration products with T1c (b) or Tr (c) are shown on the right. (D) J1 integrates selectively into its complementary target substrate (T1c) even in the presence of excess scrambled target substrates. Competition assays with two different scrambled target pools (TsI and TsII), containing a conserved TTAC site and different sets of scrambled sequences in the 8 nt variable region, are shown. The molar ratio of T1c:TsI or T1c:TsII is indicated above the gel. Positions of the J1 substrate (a), cleavage (d) and strand transfer products with T1c (b) or TsI/TsII (c) in the sequencing gel are indicated by arrows.
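
The retargeting design in panel (B) can be illustrated with a minimal sketch: scan a target sequence for the native TTAC site and report the 8 nt immediately upstream, together with their reverse complement as a candidate extension sequence. This assumes plain antiparallel Watson-Crick pairing for the extension and does not model the G_L/C_T interaction or the non-canonical base triplets seen in the structure. The function names and the example sequence are hypothetical and are not taken from the paper.

```python
CLEAVAGE_SITE = "TTAC"   # native target cleavage site (C_T)
EXTENSION_LENGTH = 8     # length of the added complementary region

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_sites(target: str, site: str = CLEAVAGE_SITE,
                    ext_len: int = EXTENSION_LENGTH):
    """List (site_start, upstream, extension) for each site with enough upstream sequence.

    `extension` is the reverse complement of the `ext_len` nt directly
    upstream of the site, i.e. a sequence that could pair with them under
    the Watson-Crick assumption stated above.
    """
    target = target.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        if start >= ext_len:  # skip sites too close to the 5' end
            upstream = target[start - ext_len:start]
            hits.append((start, upstream, reverse_complement(upstream)))
        start = target.find(site, start + 1)  # allow overlapping matches
    return hits


if __name__ == "__main__":
    example_target = "GGCATCGATGCATTACGGA"  # made-up sequence for illustration
    for pos, upstream, extension in candidate_sites(example_target):
        print(pos, upstream, extension)
```

Whether a given extension works in practice depends on factors outside this sketch, such as the secondary structure of the junction discussed in Figure 6.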

**Figure 5.**
IS608 integration specificity can be enhanced to 17 nt sites and retargeted to non-native target sites. (A) Integration reactions with a junction and target pair engineered to form 13 additional base pairs (see light blue shade in the scheme; J3 and T3c_1) are shown on sequencing PAGE (bottom). J3 integration to T3c_1 was compared with a target containing only 5 nt complementarity in the variable region (light blue; T3c_2) or a random target (Tr) maintaining only the G_L/C_T interaction. Target substrates contain various 3′ segments following the cleavage site to distinguish integration products. In competition reactions (lanes 6, 7), targets were combined in 1:1 molar ratio. Bands corresponding to the substrates and products are indicated on the right. (B) IS608 targeting to integration sites with alternative C_T sequences. Two different sets of junction/target complementary pairs with mutations in G_L and C_T were designed (J4/T4c and J5/T5c), as shown on the top. Light-blue shade highlights the complementary regions and arrows mark the cleavage positions. Reaction products obtained with 5′ ³²P-labeled J4 and J5 junctions and unlabeled targets were analyzed on a sequencing gel (bottom). Random targets T4r and T5r, containing the same C_T sequence as in T4c and T5c, respectively, with a random sequence in the 8 nt variable region were used as controls. T4r and T5r contain a shorter (30 nt) 3′ segment following the cleavage site. Integration into T4r or T5r is very inefficient even in large excess of the target substrate (lanes 4–6 and 11–13), whereas T4c and T5c produce more product (lanes 3 and 10) and compete favourably with the random targets (lanes 7 and 14). Substrates and products are shown schematically on the right.

**Figure 6.**
Secondary structure of the IS608 LE limits target specificity. (A) Predicted secondary structure of the wild type IS608 junction (Jwt) depicts an additional hairpin downstream of IP_L, which includes the positions used for extended retargeting (highlighted in light blue). To test the role of this hairpin, Jwt and J1 were engineered to disrupt and introduce base pair complementarity within the second hairpin, respectively (bottom insert). (B) Integration activity of radiolabeled Jwt-oh and J1-h junction substrates with unlabeled target oligos containing a complementary (Twtc/T1c) or random (Tr) sequence in the 8 nt variable region used for retargeting. For competition reactions, Twtc/T1c and Tr were combined in 1:1 molar ratio (lanes 4, 7, 11, 14). Reaction substrates and products (identified by arrows on the right, as in Figure 4) were separated by sequencing PAGE. (C) A variant of Jwt-oh including a mutation of A⁺⁴² to T (Jwt-oh-42T) was analyzed in combination with different Twtc:Tr ratios, as indicated.

References

1. Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., FitzHugh W. et al. Initial sequencing and analysis of the human genome. Nature. 2001; 409:860–921.
2. Chain P.S., Carniel E., Larimer F.W., Lamerdin J., Stoutland P.O., Regala W.M., Georgescu A.M., Vergez L.M., Land M.L., Motin V.L. et al. Insights into the evolution of Yersinia pestis through whole-genome comparison with Yersinia pseudotuberculosis. Proc. Natl. Acad. Sci. U.S.A. 2004; 101:13826–13831.
3. Wei J., Goldberg M.B., Burland V., Venkatesan M.M., Deng W., Fournier G., Mayhew G.F., Plunkett G. 3rd, Rose D.J., Darling A. et al. Complete genome sequence and comparative genomics of Shigella flexneri serotype 2a strain 2457T. Infect. Immun. 2003; 71:2775–2786.
4. Curcio M.J., Derbyshire K.M. The outs and ins of transposition: from mu to kangaroo. Nat. Rev. Mol. Cell Biol. 2003; 4:865–877.
5. Chen J.M., Stenson P.D., Cooper D.N., Ferec C. A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease. Hum. Genet. 2005; 117:411–427.
