Nature. 2009 Aug 6;460(7256):711-6.
doi: 10.1038/nature08237.

Architecture and secondary structure of an entire HIV-1 RNA genome

Joseph M Watts et al. Nature. 2009.

Abstract

Single-stranded RNA viruses encompass broad classes of infectious agents and cause the common cold, cancer, AIDS and other serious health threats. Viral replication is regulated at many levels, including the use of conserved genomic RNA structures. Most potential regulatory elements in viral RNA genomes are uncharacterized. Here we report the structure of an entire HIV-1 genome at single nucleotide resolution using SHAPE, a high-throughput RNA analysis technology. The genome encodes protein structure at two levels. In addition to the correspondence between RNA and protein primary sequences, a correlation exists between high levels of RNA structure and sequences that encode inter-domain loops in HIV proteins. This correlation suggests that RNA structure modulates ribosome elongation to promote native protein folding. Some simple genome elements previously shown to be important, including the ribosomal gag-pol frameshift stem-loop, are components of larger RNA motifs. We also identify organizational principles for unstructured RNA regions, including splice site acceptors and hypervariable regions. These results emphasize that the HIV-1 genome and, potentially, many coding RNAs are punctuated by previously unrecognized regulatory motifs and that extensive RNA structure constitutes an important component of the genetic code.

Figures

Figure 1
Organization, extent of RNA structure, and relationship to protein structure for an HIV-1 genome. (a) HIV-1 genome organization. Protein coding regions are shown as gray boxes; polyprotein domain junctions are depicted as solid vertical lines. Gene start and end sites are numbered according to NL4-3. (b) Comparison of median SHAPE reactivities (thick blue line) and evolutionary pairing probabilities (cyan line). Medians are calculated using a 75 nt window. The global median (= 0.34) is depicted as a red line. Pairing probability is not reported for regions encoding overlapping reading frames. (c) Inter-protein linkers in polyprotein precursors and the unstructured peptide loops that link protein domains are indicated with green and yellow bars, respectively. The single inter-protein linker that is not encoded by a region of highly structured RNA (at the RNase H-IN junction) is shown with an open green bar. (d) Comparison of domain structures for the large HIV proteins with the structure of the encoding RNA. Polyprotein linkers are green; inter-domain loops are yellow; folded protein domains are blue, red, light magenta, purple, and gray (Table S2).
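The windowed medians in panel (b) can be illustrated with a short sketch. This is not the authors' code: the function name `windowed_median`, the centered window, and the truncation of windows at the sequence ends are all assumptions made for illustration. Only the 75 nt window size and the global median of 0.34 come from the caption.

```python
from statistics import median

GLOBAL_MEDIAN = 0.34  # global median SHAPE reactivity quoted in the caption


def windowed_median(reactivities, window=75):
    """Centered sliding-window median of per-nucleotide reactivities.

    Assumption: windows are simply truncated at the 5' and 3' ends;
    the caption does not say how the ends were handled.
    """
    half = window // 2
    n = len(reactivities)
    medians = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        medians.append(median(reactivities[lo:hi]))
    return medians


def below_global_median(medians, threshold=GLOBAL_MEDIAN):
    """Flag positions whose windowed median falls below the global median."""
    return [m < threshold for m in medians]
```

Because low SHAPE reactivity indicates constrained nucleotides, positions flagged by `below_global_median` would correspond to the more highly structured stretches of the genome in this sketch.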
Figure 2
Structure of the HIV-1 NL4-3 genome. The 5′ and 3′ genome halves are shown in panels (a) and (b). Nucleotides are colored by their absolute SHAPE reactivities (see scale in panel a). Every nucleotide is shown explicitly as a sphere; base pairing is indicated by adjacent parallel orientation of the spheres. Protein domains are identified by letters. Important structural landmarks are labeled explicitly. Full nucleotide identities and pairings are provided in the supplementary information (Fig. S7). Intermolecular base pairs involving the tRNA(Lys3) primer and the genomic dimer are shown in gray. Inset shows a box plot illustrating SHAPE reactivities for single-stranded versus paired nucleotides in the model. Median reactivities are indicated by bold horizontal lines; the large box spans the central 50% of the reactivity data.
Figure 3
SHAPE analysis of the signal peptide (SP) gp120 region. (a) Processed capillary electrophoresis trace showing intensity versus position for the (+) and (−) reagent reactions. (b) Histogram of integrated and normalized SHAPE reactivities as a function of nucleotide position. The SHAPE reactivity scale shown here is used consistently throughout this work. (c) RNA secondary structure model for the SP pause site stem. (d) Location of the SP-stem relative to the eukaryotic ribosome at the pause site. Base pairs disrupted when the ribosome is at the pause site are boxed.
Figure 4
RNA structure in Env hypervariable regions. (a) Schematic sequence alignment for group M reference sequences at the Env hypervariable regions (hV1, hV2, hV4 & hV5). Nucleotides are represented as vertical bars; light gray and black indicate low versus universal conservation, respectively. (b) Evolutionary pairing probabilities. Breaks indicate extensive nucleotide insertions and deletions among the group M consensus sequences. (c) RNA structures at the hypervariable coding regions hV1, hV2, hV4, and hV5. Calculated free energies are shown for each helix (in kcal/mol); energies for anchoring helices proposed to function as structural insulators are emphasized in bold. (d) Distribution of helix stabilities in the HIV genome shown in a box plot representation. Whiskers illustrate 1.5 times the interquartile range and circles emphasize helices of exceptionally high stability. Free energy changes for proposed insulating helices are in bold; other significant helices are labeled.
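The box plot convention in panel (d) can be sketched as follows. The helper names `tukey_fences` and `exceptionally_stable` are hypothetical, and the quartile method (Python's "inclusive" interpolation) is an assumption, since the caption states only that whiskers span 1.5 times the interquartile range. More stable helices have more negative free energies, so the sketch looks for values below the lower fence.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences at k times the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def exceptionally_stable(free_energies, k=1.5):
    """Helix free energies (kcal/mol) lying below the lower fence."""
    lower, _ = tukey_fences(free_energies, k)
    return [dg for dg in free_energies if dg < lower]
```

With made-up free energies such as `[-30.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0]`, only the `-30.0` helix falls below the lower fence and would be drawn as a circle.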
