Review
FEBS J. 2009 Jun;276(11):2926-46.
doi: 10.1111/j.1742-4658.2009.07009.x. Epub 2009 Apr 14.

Piecing together the structure of retroviral integrase, an important target in AIDS therapy

Mariusz Jaskolski et al. FEBS J. 2009 Jun.

Abstract

Integrase (IN) is one of only three enzymes encoded in the genomes of all retroviruses, and is the one least characterized in structural terms. IN catalyzes processing of the ends of a DNA copy of the retroviral genome and its concerted insertion into the chromosome of the host cell. The protein consists of three domains, the central catalytic core domain flanked by the N-terminal and C-terminal domains, the latter being involved in DNA binding. Although the Protein Data Bank contains a number of NMR structures of the N-terminal and C-terminal domains of HIV-1 and HIV-2, simian immunodeficiency virus and avian sarcoma virus IN, as well as X-ray structures of the core domain of HIV-1, avian sarcoma virus and foamy virus IN, plus several models of two-domain constructs, no structure of the complete molecule of retroviral IN has been solved to date. Although no experimental structures of IN complexed with the DNA substrates are at hand, the catalytic mechanism of IN is well understood by analogy with other nucleotidyl transferases, and a variety of models of the oligomeric integration complexes have been proposed. In this review, we present the current state of knowledge resulting from structural studies of IN from several retroviruses. We also attempt to reconcile the differences between the reported structures, and discuss the relationship between the structure and function of this enzyme, which is an important, although so far rather poorly exploited, target for designing drugs against HIV-1 infection.

Figures

Figure 1. A schematic representation of the reaction catalyzed by retroviral IN during an infection cycle
This example shows the activity of HIV-1 IN. The reaction catalyzed by enzymes from other retroviruses may differ in some details, but the general scheme is the same. In the processing step (A→B), the 3′ ends of viral DNA (colored molecule) are nicked (arrowheads) before the phosphate group (diamond) of the conserved terminal GT dinucleotide (colored beads, A=yellow, C=blue, G=green, T=red), leading to a DNA molecule with a 5′ overhang and a free 3′ OH group on each strand. In the joining step (B→C), host DNA (black) is nicked with a five-nucleotide stagger (vertical bars) on the two strands, and the free 3′ ends of the viral substrate are joined to both host strands, preserving DNA polarity. Panels D and E are equivalent to C and are presented to illustrate the topology of the final DNA product (not shown), which is created from molecule E by cellular DNA repair enzymes, which remove the overhanging viral 5′ dinucleotides and seal the gaps on both sides of the integrated viral DNA. In the final product, the viral insert is flanked by the repeated stagger sequence, and begins with the conserved TG sequence at each 5′ end.
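The processing and joining steps described in this caption can be sketched on plain sequence strings. The following is a toy model only, not a description of the enzyme mechanism: the function names (`revcomp`, `process_3prime_ends`, `integrate`), the short viral and host sequences, and the choice to represent each duplex by its top strand are all assumptions made for illustration. It assumes the HIV-1 case from the caption, in which a GT dinucleotide is removed from each 3′ end and the host strands are cut with a five-nucleotide stagger.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def process_3prime_ends(viral_top):
    """Processing step: remove the terminal GT from the 3' end of each strand.

    Returns the processed (top, bottom) strands, both written 5'->3'.
    """
    top, bottom = viral_top, revcomp(viral_top)
    for strand in (top, bottom):
        if not strand.endswith("GT"):
            raise ValueError("each strand must end with the conserved GT")
    return top[:-2], bottom[:-2]


def integrate(host_top, viral_top, site, stagger=5):
    """Joining step plus repair, returning the top strand of the final product.

    `site` is the index in the host top strand where the staggered cut starts.
    Repair removes the two-nucleotide 5' overhangs of the viral DNA and fills
    the gaps, which duplicates the `stagger` host nucleotides on both sides.
    """
    if not 0 <= site <= len(host_top) - stagger:
        raise ValueError("integration site does not fit in the host sequence")
    top, _bottom = process_3prime_ends(viral_top)
    insert = top[2:]  # drop the unpaired 5' dinucleotide of the top strand
    duplicated = host_top[site:site + stagger]
    return (host_top[:site] + duplicated + insert
            + duplicated + host_top[site + stagger:])


# Hypothetical toy sequences, chosen only so that both 3' ends carry GT.
viral = "ACTGGAAGGGCTCAGT"
host = "TTTTGATCAAAAA"
print(integrate(host, viral, site=4))
```

In this sketch the integrated insert starts with TG on the top strand and ends with CA, so its bottom strand also starts with TG, and the five host nucleotides at the cut site appear once on each side of the insert.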
Figure 2. Amino-acid sequence alignment of retroviral integrases
The secondary structure of HIV-1 IN is shown below the sequences (α helices marked as cylinders, β strands as arrows). Symbols:
Figure 3. The structures of the monomers of individual domains of HIV-1 IN
(A) The NTD domain (blue) with a Zn2+ cation (large sphere) coordinated (thin lines) by an HHCC motif (ball-and-stick) of a helix-turn-helix (HTH) fold, is represented by the NMR structure 1WJC [75]. (B) The CCD domain (green), shown with the D,D(35)E catalytic residues (ball-and-stick), a magnesium cation (large sphere) coordinated in site I and the flexible active-site loop highlighted in gray, is represented by the crystal structure 1BL3 [49]. The finger loop (red) extrudes from the body of the protein on the right, between helices α5 and α6 (C terminus). (C) The CTD domain (red) is represented by the NMR structure 1IHV [80]. This and all subsequent figures were prepared with PyMOL [107].
Figure 4. The active site of retroviral integrases
The figures show, in stereoview, the three essential acids of the D,D(35)E motif in selected, least-squares-superposed crystallographic structures of the catalytic core domain in (A) unliganded and (B) Mg-complexed form. The catalytic residues are shown in the context of the protein secondary structure by which they are contributed, namely an extended β-ribbon (the first aspartate, middle of figure), a loop (the second aspartate, left), and an α-helix (the glutamate, right). The residue numbering D64/D116/E152 is for the HIV-1 IN sequence, and corresponds to D64/D121/E157 in ASV IN. The three divalent-metal-cation-free active sites shown in (A) correspond to the first HIV-1 IN structure (1ITG, orange) [40], solved in the presence of arsenic (part of cacodylate buffer), which reacted with cysteine residues, including one within the active site area (orange sphere); to another medium-resolution structure of HIV-1 IN (1BI4, molecule C, gray with red O atoms) [49]; and to the atomic-resolution structure of ASV IN (1CXQ, green) [57]. Note that the aspartates in 1ITG have a completely different orientation than in the remaining structures and the entire Asp116 loop has a different, non-native conformation. Another symptom of active-site disruption in the 1ITG structure is the absence in the model of the Glu152 residue, a consequence of disorder in this helical segment. The active sites complexed with the catalytic cofactor Mg2+ (large sphere) are shown (B) for HIV-1 IN, 1BL3 (molecule C, gray with red O atoms) [49], ASV IN, 1VSD (green) [53], and for PFV IN, molecule A of 3DLR (orange) [58]. The structure of ASV IN has the highest resolution, and its quality is reflected in the nearly ideal octahedral geometry (thin green lines) of the Mg2+ coordination sphere, which in addition to interactions with the carboxylate groups of both active-site aspartates includes four precisely defined water molecules. The coordination geometry of the HIV-1 IN complex 1BL3 is significantly distorted. The view direction in both figures is similar, with a small rotation around the horizontal axis.
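One way to put a number on "nearly ideal" versus "significantly distorted" octahedral geometry is to compare the fifteen ligand-metal-ligand angles with their ideal values (twelve at 90° and three at 180°). The sketch below is an illustrative metric, not the analysis used in the review; the function names and the coordinates in the usage lines are hypothetical, with ligands placed 2.1 Å from the metal as a typical Mg–O distance.

```python
import math
from itertools import combinations


def angle_deg(center, a, b):
    """Angle a-center-b in degrees for three points given as (x, y, z)."""
    va = [a[i] - center[i] for i in range(3)]
    vb = [b[i] - center[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def octahedral_angle_rmsd(metal, ligands):
    """RMS deviation (degrees) of the 15 L-M-L angles from an ideal octahedron.

    The observed angles are sorted and compared with the sorted ideal set,
    so the 12 smallest are treated as cis (90) and the 3 largest as trans (180).
    """
    if len(ligands) != 6:
        raise ValueError("an octahedral site needs exactly six ligands")
    observed = sorted(angle_deg(metal, a, b) for a, b in combinations(ligands, 2))
    ideal = [90.0] * 12 + [180.0] * 3
    return math.sqrt(sum((o - i) ** 2 for o, i in zip(observed, ideal)) / 15)


# Hypothetical coordinates (in angstroms), not taken from any PDB entry.
metal = (0.0, 0.0, 0.0)
regular = [(2.1, 0, 0), (-2.1, 0, 0), (0, 2.1, 0),
           (0, -2.1, 0), (0, 0, 2.1), (0, 0, -2.1)]
distorted = [(2.1, 0.5, 0)] + regular[1:]
print(octahedral_angle_rmsd(metal, regular))
print(octahedral_angle_rmsd(metal, distorted))
```

A value near zero indicates a regular coordination sphere, and larger values indicate distortion. Sorting the angles avoids having to assign cis and trans pairs by hand, at the cost of being unreliable for very strongly distorted sites.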
Figure 5. Small-molecule inhibitors of the CCD domain of retroviral IN
(A) Chemical diagrams of selected inhibitors discussed in this review. (B) A dimer of the CCD domains (colored silver and gold) of HIV-1 IN shown in surface representation roughly down its two-fold axis. The two active sites are marked by the Mg2+ cations (gray spheres), with their octahedral coordination spheres formed by the carboxylates of Asp64 and Asp116, and by four water molecules (red spheres). Note that the active sites are located in shallow depressions on the surface of the protein, with the magnesium cations completely exposed to solvent. Next to each active site, a long groove runs along the surface of the protein. In this structure with the PDB code 1QS4 [43], one of the active-site grooves is occupied by the 5-CITEP inhibitor, depicted here in ball-and-stick representation with C/N/O/Cl atoms shown in orange/blue/red/green. The two active sites are separated by 40.4 Å, as measured by the distance between the Mg2+ centers.
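A separation such as the Mg2+…Mg2+ distance quoted above is obtained from the Cartesian coordinates stored in the PDB entry. As a minimal sketch, the snippet below reads x, y and z from the fixed columns (31-54) of two HETATM records and computes the Euclidean distance. The two records are hypothetical placeholders written for this example, not lines copied from 1QS4, and the helper names are assumptions.

```python
import math


def parse_xyz(record):
    """Read (x, y, z) from columns 31-54 of a PDB ATOM or HETATM record."""
    if not record.startswith(("ATOM", "HETATM")):
        raise ValueError("not a coordinate record")
    return (float(record[30:38]), float(record[38:46]), float(record[46:54]))


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in angstroms."""
    return math.dist(a, b)


# Hypothetical records in PDB fixed-column layout (coordinates are made up).
SITE_A = "HETATM    1 MG    MG A 301      10.000  20.000  30.000"
SITE_B = "HETATM    2 MG    MG B 301      13.000  24.000  30.000"

print(round(distance(parse_xyz(SITE_A), parse_xyz(SITE_B)), 2))
```

Slicing by column rather than splitting on whitespace matters for real files, because adjacent numeric fields in PDB records can run together without a separating space.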
Figure 6. Three-dimensional structures of dimeric two-domain constructs of HIV integrase determined by X-ray crystallography
(A) In the NTD-CCD structure (1K6Y, [44]), the linker between the domains is disordered and the speculative NTD (blue) - CCD (green) pairing (dashed line) has been proposed from indirect reasoning, such as the existence of contacts between the NTD domains. (B) Mutual orientation of the NTD (blue) and CCD (green) domains as found experimentally in the structure 3F9K of HIV-2 IN [59]. The linker, visible in molecule A, is shown in black. (C) This stereoview has been constructed by least-squares superposition of the CCD domains of molecules A (red) and B (orange) from the 2:1 complex of HIV-2 IN (3F9K) with the Integrase Binding Domain (IBD) of the LEDGF protein (molecule C, gray), onto the CCDs of molecules A (limon) and B (green) of HIV-1 IN (1K6Y). Note that the smaller, all-helical NTDs of the HIV-1 protein are lifted above (in this view, shooting to the right) the CCD domains, while in the model of HIV-2 IN they “fold back” and adhere to the sides of the CCD dimer. The linkers connecting the NTD and CCD are not present in any of the experimental models shown in this figure, except in molecule A (red) of 3F9K, for which clear electron density allowed the domains to be connected unambiguously. The asymmetric unit of the 1K6Y structure contains another NTD-CCD dimer, here represented in shades of blue. Note that the blue NTD (of molecule D) superposes exactly on the NTD of molecule B (orange) of the HIV-2 NTD-CCD dimer. This unexpected match is a strong indication that the pairing of the NTD and CCD domains in the 1K6Y structure, which was proposed in the absence of experimental evidence, does not correspond to the functional conformation of the protein. In a fashion similar to the “blue” NTD (chain D) – “green” CCD (chain B) pairing, the “green” NTD (chain B) and “blue” CCD (chain D) domains of HIV-1 integrase can also be assembled. By generating a symmetry-related copy of the “limon” (chain A) NTD domain (upper yellow), one can complete the entire NTD-CCD dimer of HIV-1 integrase with the blue catalytic cores. Likewise, a crystallographic copy of the NTD domain attributed to chain C (bottom yellow) will complete the HIV-1 integrase with the green catalytic cores. To guide the eye in this complicated view, the missing connections between the NTD and CCD domains have been generated by copying the linker chain from molecule A of the 3F9K model and grafting it into the remaining molecules (black Cα traces). In this way, four functional HIV NTD-CCD molecules, assembled into two dimers, have been generated. (D) In the CCD-CTD dimer (1EX4, [45]), the interdomain linker forms a long helix. Because the degree of deformation of this helix differs, the relative orientation between the CCD (green) and CTD (red) domains is different in the two monomers. All the NTD-CCD dimers considered in this figure have essentially a common two-fold axis for both domains. This is not true for the CCD-CTD construct (D), where the two-fold axes relating the CCD and CTD domains are not identical.
Figure 7. Stereoview of the NTD docking site at the “finger” structure of the CCD domain
The NTD (blue) is shown with its CCD domain, as found in the LEDGF complex of the HIV-2 protein (PDB code 3F9K, chain A) [59]. The “finger” loop (red), which is located between helices α5 (green) and α6 (light green) of the CCD domain, forms five hydrogen bonds/salt bridges (broken line) with the NTD domain, and another one with the linker peptide (light blue) connecting the domains. One of those interactions would require a flip of the side-chain amide group of Asn18, near the entry to an α helix (upper left) of the NTD domain. However, such a flip would create another impossible NH…HN interaction (orange) at the N-terminus of this helix. The tip of the finger loop is occupied by an isoleucine residue (green ball-and-stick model).
Figure 8. Stereoview of a structural superposition of several two-domain constructs of retroviral integrases
The superpositions were calculated using only the Cα atoms of the CCD domain (bottom) to show possible mutual orientations of all three domains. Until the structure of intact IN is determined experimentally, this is the best approximation of the three-dimensional model of the enzyme, here shown only for the monomeric molecule. According to available data on the dimeric structure of IN domains, a homodimer of IN could be created by rotating the above model by 180° around the vertical line and placing it face-to-face with the original copy, so as to recreate the dimeric interface at the flat face (back of the view) of the CCD domain. The figure uses the following color code: red and orange, molecules A and B of the HIV-1 NTD-CCD protein 1K6Y [44]; salmon, molecule A of the HIV-2 NTD-CCD protein 3F9K [59]; blue, ASV CCD-CTD protein 1C0M [60]; dark/light green, molecules A and B of the HIV-1 CCD-CTD protein 1EX4 [45]; yellow, SIV CCD-CTD protein 1C6V [46]. In the 1C6V structure, the domains that are displayed (D and X) were not interpreted as a single molecule in the original publication. It is evident that the CTD, as represented in this figure (yellow-green-blue colors), covers a wide angular range of its disposition relative to CCD. Note that the NTDs from the 1K6Y structure (red and orange) were linked by the authors to the CCDs without experimental evidence. In a different interpretation of the 1K6Y crystal structure, it is possible to select an NTD partner for the CCD that essentially superposes on the salmon model from the 3F9K structure (see Fig. 6C), which occupies this conformation without any ambiguity as it can be traced via an uninterrupted connection to the CCD domain. It is thus very likely that, in contrast to the CTD domain, the NTD domain has a fixed orientation relative to CCD.
Figure 9. Divalent metal cation binding sites in integrases
Comparison of the two metal sites found in the Cd2+ complex of ASV IN (PDB code 1VSJ, color code according to element type) [54] and in the Mg2+/RNA/DNA complex of RNase H (1ZBL, molecule A, orange) [103]. The active sites were superposed by a simple least-squares fit of the metal sites and the carboxylate O atoms of the bridging aspartate residue, which in ASV IN is the first element (Asp64) of the D,D(35)E active site. The metal…metal distance in both structures is nearly identical, 4.10 Å in the RNase H complex and 4.05 Å in the IN complex. Site I, denoted A in [103,108], which in retroviral integrase structures was also seen with the catalytic Mg2+ or Mn2+ cations, has a more regular octahedral coordination. The coordination spheres I in the two structures are similar, except that two ligands in the equatorial plane, an aspartate (Asp121 in ASV IN) Oδ atom and a water molecule, have swapped places (in 1ZBL, the aspartate in question has been mutated to asparagine, D192N). Another important difference is that the bridging water molecule in the ASV IN complex is replaced in the RNase H complex by an O atom from the scissile phosphate group (P) of the RNA substrate. The phosphate group has a particularly important role in the formation of site II, as it provides two of the ligands. Site II (denoted B in [103,108]) has a far less regular geometry; the coordination sphere is incomplete in 1VSJ and highly distorted in 1ZBL. Site II has never been seen occupied by Mg2+ or Mn2+ in metal complex structures of retroviral INs, and the glutamic acid that takes part in its formation (Glu157 in ASV IN) is the most mobile element of the IN active site. From this comparison, it is very likely that a proper site II of retroviral integrase, occupied by a catalytically competent metal cation (Mg2+ or Mn2+), could only be formed with the participation of a DNA substrate.

References

    1. Coffin JM, Hughes SH, Varmus HE. Retroviruses. Cold Spring Harbor Laboratory Press; New York: 1997.
    2. Grunewald K, Cyrklaff M. Structure of complex viruses and virus-infected cells by electron cryo tomography. Curr Opin Microbiol. 2006;9:437–442.
    3. Frankel AD, Young JA. HIV-1: fifteen proteins and an RNA. Annu Rev Biochem. 1998;67:1–25.
    4. Katz RA, Skalka AM. The retroviral enzymes. Annu Rev Biochem. 1994;63:133–173.
    5. Wlodawer A, Vondrasek J. Inhibitors of HIV-1 protease: a major success of structure-assisted drug design. Annu Rev Biophys Biomol Struct. 1998;27:249–284.
