Nucleic Acids Res. 2018 May 4;46(8):4152-4163.
doi: 10.1093/nar/gky235.

Targeting IS608 transposon integration to highly specific sequences by structure-based transposon engineering

Natalia Rosalía Morero et al. Nucleic Acids Res.

Abstract

Transposable elements are efficient DNA carriers and thus important tools for transgenesis and insertional mutagenesis. However, their poor target sequence specificity constitutes an important limitation for site-directed applications. The insertion sequence IS608 from Helicobacter pylori recognizes a specific tetranucleotide sequence by base pairing, and its target choice can be re-programmed by changes in the transposon DNA. Here, we present the crystal structure of the IS608 target capture complex in an active conformation, providing a complete picture of the molecular interactions between transposon and target DNA prior to integration. Based on this, we engineered IS608 variants to direct their integration specifically to various 12/17-nt long target sites by extending the base pair interaction network between the transposon and the target DNA. We demonstrate in vitro that the engineered transposons efficiently select their intended target sites. Our data further elucidate how the distinct secondary structure of the single-stranded transposon intermediate prevents extended target specificity in the wild-type transposon, allowing it to move between diverse genomic sites. Our strategy enables efficient targeting of unique DNA sequences with high specificity in an easily programmable manner, opening possibilities for the use of the IS608 system for site-specific gene insertions.
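As a rough illustration of why extending recognition from the native tetranucleotide to 12-nt or 17-nt sites matters, the sketch below computes how often a specific site of a given length would be expected by chance. It assumes a uniformly random sequence with independent, equally frequent bases, which real genomes do not follow, so the numbers are order-of-magnitude guides only. The function name `expected_spacing` is hypothetical and not taken from the paper.

```python
# Sketch only: assumes uniform, independent base composition (A, C, G, T each
# with probability 1/4). Real genomes deviate from this model.

def expected_spacing(site_length: int) -> int:
    """Mean number of positions per chance occurrence of one specific site."""
    if site_length < 0:
        raise ValueError("site_length must be non-negative")
    return 4 ** site_length

# 4 nt: native tetranucleotide; 12 nt and 17 nt: engineered sites.
for n in (4, 12, 17):
    print(f"{n:>2}-nt site: about 1 per {expected_spacing(n):,} positions")
```

Under this simplified model a tetranucleotide recurs about every 256 positions, whereas a 17-nt site would be expected roughly once per 1.7 × 10^10 positions.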

Figures

Figure 1.
The IS608 structure and transposition mechanism (adapted from (16)). (A) The IS608 left (LE, red) and right (RE, blue) ends flank the tnpA and tnpB open reading frames (block arrows). Subterminal imperfect palindromes (IPR and IPL), right and left cleavage sites (CR: TCAA and CL: TTAC) and right and left guide sequences (GR: GAAT and GL: AAAG) are highlighted. Black wedges mark the positions of cleavage and 5′ phosphotyrosine TnpA-DNA intermediate formation. (B) Model of the IS608 transposition pathway. After transposon end cleavage (i), the donor DNA sequence is precisely sealed and a circular transposon junction is formed (ii) as an intermediate before cleavage and re-integration into a new target site (iii and iv). Black wedges mark the positions of cleavage at the transposon ends (i) and 3′ to a target cleavage site (CT) (iii). Specific base-pairing between guide and cleavage sequences in the transposon before excision (i), and between GL and CT before re-integration (iii), are indicated with dotted lines.
Figure 2.
The IS608 target capture complex structure. (A) Overall view of the IS608 TnpA/LE29/T6’ structure. One of two synaptic complexes in the crystal asymmetric unit is shown. A TnpA dimer (cartoon representation, chain A in light blue and chain B in blue) is bound to two LE29 hairpin DNA molecules (red, GL in orange) and two T6’ target oligos including positions -5 to +1 (grey, with the C+1 nucleotide highlighted in black). Catalytic residues are shown in sticks representation with atomic colouring. Ca2+ ions are shown as green spheres. (B) The architecture of LE29 and its specific base contacts with T6’. Blue dotted arrows indicate non-canonical base interactions between A+42 and T+43 3′ of IPL with A+17 and A+16 from GL, respectively, which create base triplets together with T−4 and T−3 from T6’. (C) Two base triplets between LE29 and T6’ (bases in sticks representation with atomic colouring), with hydrogen bonds shown as dotted lines. (D) Superposition of the four TnpA/LE29/T6’ complexes (i to iv) present in the crystallographic asymmetric unit, highlighting the different orientations of C+1 (black sticks). (E) Coordination of the metal ion cofactor in the active site and the position of C+1 in the active pre-cleavage conformation (complex i). Catalytic residues as well as amino acids and DNA forming the C+1 binding site are shown as sticks with atomic colouring.
Figure 3.
Sequence hallmarks affecting IS608 target selection. (A) Scheme of the IS608 left end (LE) and target oligos (Ti) used to monitor TnpA-mediated cleavage with variable sequences upstream and downstream of the core TTAC target sequence (CT). The arrow indicates the position of target cleavage. (B) Cleavage assays monitoring covalent TnpA-DNA complex formation on SDS-PAGE gels. Upon Ti cleavage, TnpA becomes covalently attached to the variable 16 nt sequence downstream of the cleavage position and can be resolved from unmodified TnpA. Targets were classified into sets with good (SET-1) and poor activity (SET-2), as shown below the gel. Cleavage reactions are shown for representative SET-1 and SET-2 targets (lanes 2–4). The negative control (lane 1) does not contain target DNA. Cleavage reactions for derivatives of target 2.2, with the sequence upstream (u), downstream (d) of TTAC or both (ud) replaced by the corresponding sequence from target 1.1 (see sequences below SET-2) are shown in lanes 5–7. (C) Mutation of the nucleotide C in position +1 compromises cleavage in representative SET-1 targets (lanes 2–5), whereas introduction of a C at this position rescues activity of weak SET-2 targets (lanes 6–11). Covalent complex formation is monitored on SDS-PAGE and target sequences are shown below.
Figure 4.
IS608 can be specifically targeted to longer integration sites by extended LE/target base pairing. (A) Close-up of the TnpA/LE29/T6’ structure, highlighting the proximity between the 3′ end of IPL and the 5′ end of the target oligonucleotide. The distance between the O5′ oxygen atom of A−5 in T6’ and the phosphorus atom (P) of A+44 in LE29 is 10.5 Å (dashed line). (B) Design of the IS608 transposon junction (Ji, where ‘i’ is a variable indicating a specific variant number) and complementary target substrates (Tic, with ‘i’ marking a specific variant as above) used for retargeting. Each set of Ji/Tic oligos was designed to include an 8 bp complementary region between the 3′ extension of the IPL and the sequence upstream of the native TTAC target site (light blue shade). The 8 bp complementary sequence displayed here corresponds to the J1/T1c pair. 32P radioisotope labeling is indicated by an asterisk. Upon target cleavage and integration (at the arrow), the radiolabeled 5′ segment of the junction upstream of the cleavage site (50 nt) is attached to the 3′ segment of the target (38 nt). (C) Sequencing DNA PAGE gel monitoring J1 cleavage and integration into its T1c complementary target. A random target substrate containing a TTAC site but no additional complementarity to the junction (marked as Tr) was used in a competition reaction with T1c (in 1:1 molar ratio) to monitor integration specificity (lane 5). Tr contains a shorter (30 nt) 3′ segment following the cleavage site than Tic, so that the integration products can be clearly distinguished. Schematics for the labeled junction substrate (a), the cleavage product (d) and integration products with T1c (b) or Tr (c) are shown on the right. (D) J1 integrates selectively into its complementary target substrate (T1c) even in an excess of scrambled target substrates. Competition assays with two different scrambled target pools (TsI and TsII), containing a conserved TTAC site and different sets of scrambled sequences in the 8 nt variable region, are shown. The molar ratio of T1c:TsI or T1c:TsII is indicated above the gel. Positions of the J1 substrate (a), cleavage (d) and strand transfer products with T1c (b) or TsI/TsII (c) in the sequencing gel are indicated by arrows.
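The retargeting design in panel B can be sketched in a few lines of Python. This is an illustration, not the authors' procedure. It assumes that the 8 bp region pairs by canonical, antiparallel Watson-Crick rules, so the junction extension is taken as the reverse complement of the 8 nt upstream of TTAC. The target sequence and all function names are hypothetical, since the actual oligo sequences are not given here. The product-length helper simply adds the segment lengths stated in the caption.

```python
# Sketch only: canonical antiparallel Watson-Crick pairing is ASSUMED, and
# the example target below is invented for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_extensions(target: str, site: str = "TTAC", n_upstream: int = 8):
    """List (site_start, upstream, extension) for every site occurrence
    that has at least n_upstream nucleotides of sequence before it."""
    target = target.upper()
    designs = []
    start = target.find(site)
    while start != -1:
        if start >= n_upstream:
            upstream = target[start - n_upstream:start]
            designs.append((start, upstream, reverse_complement(upstream)))
        start = target.find(site, start + 1)
    return designs

def integration_product_length(junction_5p_len: int, target_3p_len: int) -> int:
    """Length of the strand-transfer product: labeled junction 5' segment
    joined to the target 3' segment that follows the cleavage site."""
    return junction_5p_len + target_3p_len

hypothetical_target = "GGCATCGATGCAATTACCGGTA"  # invented example sequence
for start, upstream, extension in design_extensions(hypothetical_target):
    print(start, upstream, extension)

# Segment lengths from the caption: 50 nt junction segment, 38 nt (Tic) or
# 30 nt (Tr) target segment.
print(integration_product_length(50, 38), integration_product_length(50, 30))
```

With the caption's segment lengths, integration into a Tic target would give an 88 nt labeled product and integration into Tr an 80 nt product, which is why the two can be separated on the sequencing gel.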
Figure 5.
IS608 integration specificity can be enhanced to 17 nt sites and retargeted to non-native target sites. (A) Integration reactions with a junction and target pair engineered to form 13 additional base pairs (see light blue shade in the scheme; J3 and T3c_1) are shown on sequencing PAGE (bottom). J3 integration to T3c_1 was compared with a target containing only 5 nt complementarity in the variable region (light blue; T3c_2) or a random target (Tr) maintaining only the GL/CT interaction. Target substrates contain various 3′ segments following the cleavage site to distinguish integration products. In competition reactions (lanes 6, 7), targets were combined in 1:1 molar ratio. Bands corresponding to the substrates and products are indicated on the right. (B) IS608 targeting to integration sites with alternative CT sequences. Two different sets of junction/target complementary pairs with mutations in GL and CT were designed (J4/T4c and J5/T5c), as shown at the top. Light-blue shade highlights the complementary regions and arrows mark the cleavage positions. Reaction products obtained with 5′ 32P-labeled J4 and J5 junctions and unlabeled targets were analyzed on a sequencing gel (bottom). Random targets T4r and T5r, containing the same CT sequence as in T4c and T5c, respectively, with a random sequence in the 8 nt variable region, were used as controls. T4r and T5r contain a shorter (30 nt) 3′ segment following the cleavage site. Integration in T4r or T5r is very inefficient even in a large excess of the target substrate (lanes 4–6 and 11–13), whereas T4c and T5c produce more product (lanes 3 and 10) and compete favourably with the random targets (lanes 7 and 14). Substrates and products are shown schematically on the right.
Figure 6.
Secondary structure of the IS608 LE limits target specificity. (A) Predicted secondary structure of the wild-type IS608 junction (Jwt) depicts an additional hairpin downstream of IPL, which includes the positions used for extended retargeting (highlighted in light blue). To test the role of this hairpin, Jwt and J1 were engineered to disrupt and to introduce, respectively, base-pair complementarity within the second hairpin (bottom insert). (B) Integration activity of radiolabeled Jwt-oh and J1-h junction substrates with unlabeled target oligos containing a complementary (Twtc/T1c) or random (Tr) sequence in the 8 nt variable region used for retargeting. For competition reactions, Twtc/T1c and Tr were combined in 1:1 molar ratio (lanes 4, 7, 11, 14). Reaction substrates and products (identified by arrows on the right, as in Figure 4) were separated on a sequencing PAGE. (C) A variant of Jwt-oh including a mutation of A+42 to T (Jwt-oh-42T) was analyzed in combination with different Twtc:Tr ratios, as indicated.
