Review
Cell Mol Life Sci. 2015 Mar;72(6):1045-58.
doi: 10.1007/s00018-014-1779-9. Epub 2014 Nov 29.

Structural mechanisms of RNA recognition: sequence-specific and non-specific RNA-binding proteins and the Cas9-RNA-DNA complex

Ting Ban et al. Cell Mol Life Sci. 2015 Mar.

Abstract

RNA-binding proteins play crucial roles in RNA processing and function as regulators of gene expression. Recent studies have defined the structural basis for RNA recognition by diverse RNA-binding motifs. While many RNA-binding proteins recognize RNA sequence non-specifically by associating with 5' or 3' RNA ends, sequence-specific recognition by RNA-binding proteins is typically achieved by combining multiple modular domains to form complex binding surfaces. In this review, we present examples of structures from different classes of RNA-binding proteins, identify the mechanisms utilized by them to target specific RNAs, and describe structural principles of how protein-protein interactions affect RNA recognition specificity. We also highlight the structural mechanism of sequence-dependent and -independent interactions in the Cas9-RNA-DNA complex.

Figures

Fig. 1
RBPs can bind target RNAs in a non-sequence-specific manner, which is dependent on recognition of marker groups at the 5′ and 3′ ends of target RNA molecules. a Overall structural views of IFIT5 in the absence of RNA presented as cartoon models (PDB accession code: 4HOQ). b Close-up view of the residues making contacts with the RNA 5′-triphosphate group (PDB accession code: 4HOR). The three phosphates are labeled as α, β, γ; RNA and the amino acids that interact with the triphosphate group are presented as stick models. The 5′ nucleotide (N1) is shown in stick representation, with carbon atoms in pink, phosphorus atoms in orange, nitrogens in blue and oxygens in red. c Overall structure of hAgo2 in complex with miR-20a (PDB accession code: 4F3T). The Mid domain is colored in red, the PIWI domain is dark blue, the PAZ domain is magenta and the N-terminal domain is light blue. The 5′ end of miR-20a is trapped at the interface of the Mid domain and PIWI domain, and the 3′ end of miR-20a is bound to the PAZ domain. RNA is represented as stick models. d Close-up view of the interactions of the first miR-20a base (U1) and the terminal monophosphate with hAgo2. Interacting residues are shown in stick representation with carbons in pink, nitrogens in blue and oxygens in red. The RNA is shown as a stick model, with carbons in yellow and phosphorus atoms in orange. The Mid domain is shown in gray and the PIWI domain is dark blue. Hydrogen bond and salt bridge interactions are indicated by black dashed lines
Fig. 2
P-Class PPR proteins bind specific RNA nucleotides via the combinatorial action of two amino acids in each PPR repeat. a Overall structure of apo-PPR10 shown in cartoon representation (PDB accession code: 4M57). b Structure of the PPR10-PSAJ RNA complex (PDB accession code: 4M59). Two PPR10 monomers are shown in cyan and pink, respectively. Single-stranded RNAs are shown as stick models and colored in magenta. c Surface representation of the binding pocket for ssRNA binding, with the region contributing to RNA binding shown in magenta. d The eight nucleotides at the 5′ end of the PSAJ RNA segment are specifically recognized by PPR10 in a modular fashion. RNA is shown in stick and cartoon representation with carbons in yellow, nitrogens in blue, and oxygens in red. The RNA backbone is shown in orange and PPR repeats in cyan. e Recognition of the bases U15 and U16 by PPR10 follows the “binding code”. The 5th residues of repeats 16 (N635) and 17 (N671) contact U15 and U16 through hydrogen bonds that are indicated by black dashed lines. RNA is shown in stick representation and colored the same as in (d). Residues that define the specificity of RNA recognition are presented as sticks, with carbons in white, nitrogens in blue and oxygens in red
Fig. 3
The THA8 homolog recognizes target RNAs through formation of a dimer or oligomer. a Superposition of apo-THA8 (magenta, PDB accession code: 4ME2) and THA8L (green, PDB accession code: 4LEU) structures in cartoon representation. b The structure of THA8 in complex with a 13-nucleotide Zm-4 RNA shown in cartoon representation, with the two monomers colored in green and magenta (PDB accession code: 4N2Q). The bound Zm-4 RNA fragment (AGAAA) is shown as a stick model at the dimer interface. c Surface charge distribution of the two different sides of the THA8 dimer. The bound RNA fragment is shown as a stick model. d Close-up view of the THA8-dimer interactions with the G nucleotide of the AGAAA motif. The carbon atoms are colored in green and magenta for the two monomers that interact with Zm4 RNA. Residues that interact with RNA are shown in stick representation, using the same color as the PPR repeat to which they belong. The G nucleotide is shown in stick representation, with carbons in white, nitrogens in blue, and oxygens in red. Hydrogen-bond interactions are indicated by black dashed lines. e A model for the regulation of RNA recognition by short PPR proteins. The single-stranded pre-mRNA induces the oligomerization of PPR proteins, thus bringing several discontinuous regions of RNA into proximity to be recognized by other splicing factors to facilitate the alternative splicing of introns. PPR proteins are presented as blue and orange ovals; splicing factors that are recruited by mRNA are colored individually. Abbreviations: SF, splicing factors; G, G nucleotides that are responsible for RNA recognition
Fig. 4
Residues at positions 12 and 16 in each PUF repeat contribute to the recognition of specific RNA bases. a Side view of the overall structure of the human Pumilio1 PUF domain bound to NRE2-10 RNA (PDB accession code: 1M8Y). Pum1 is presented as a cartoon model and RNA nucleotides are shown in stick representation, with carbons in white, nitrogens in blue, and oxygens in red. b Enlarged view of recognition of RNA bases of NRE2-10 RNA by the PUF repeats. Residues at positions 12 and 16 from each PUF repeat form hydrogen bonds with one RNA base (shown as black dashed lines) and the residue at position 13 stacks with RNA bases. PUF repeats are colored in green, with residues that define the specific RNA recognition shown as stick models. c Close-up view of the 12th and 16th residues of PUF repeat 3 contacting the base of the A8 nucleotide through hydrogen bonds. Hydrogen bonds with Q939 and C935 are indicated with black dashed lines. A8 stacks between R936 and H972. d Y1123 and H1159 from repeat 8 form stacking interactions with the uracil base (U3). N1122 (white) and Q1126 (blue) form hydrogen bonds with the uracil base. Hydrogen bonds are indicated by black dashed lines
Fig. 5
The structural basis for RNA binding by the TZF RBD. a Structure of the TIS11d/RNA complex (PDB accession code: 1RGO). Each Zn2+ finger domain (blue) is bound to one “UAUU” subsite. The RNA is shown in stick representation with carbons shown in yellow, nitrogens in blue, oxygens in red, and phosphorus in orange. The Zn2+ ions are presented as pink spheres. b Superposition of two finger domains of TIS11d, colored in green and magenta. c Close-up view of the interactions between finger 1 and the U6 and A7 nucleotide bases. Residues contacting nucleotides are shown as stick models with carbons in blue; RNA is presented as a stick model using the same color code as in (a). Hydrogen bonds are indicated as black dashed lines
Fig. 6
Three subclasses of RRMs use different mechanisms for RNA binding. a Structure of hnRNP A1 RRM1 bound to single-stranded telomeric DNA (PDB accession code: 2UP1). b Structure of hnRNP F qRRM2 bound to 5′-AGGGAU-3′ RNA (PDB accession code: 2KG0). c Structure of the SRSF1 pseudo-RRM bound to 5′-AGGAC-3′ RNA (PDB accession code: 2M8D). DNA and RNAs are shown in stick representations with carbons in yellow, nitrogens in blue, oxygens in red, and phosphorus in orange. RRM motifs are shown as cartoon models in cyan. d Close-up view of interactions of the SRSF1 pseudo-RRM bound to the GG dinucleotide of the 5′-UGAAGGAC-3′ RNA. e Close-up view of the structure of the SRSF1 pseudo-RRM bound to the Trp-Gly-His tripeptide of SRPK1 (PDB accession code: 3BEG). The side-chains of W88 and H90 occupy the same sites as G6 and G5, respectively. Both the G6 base and the side-chains of W88 and H90 could interact with SRSF1 via hydrogen bond formation. The RRMs are shown in gray with residues interacting with the RNA or peptide presented as stick models in dark blue. The Trp-Gly-His tripeptide is presented as a stick model in green; hydrogen bonds are indicated as black dashed lines
Fig. 7
Structural mechanism of sgRNA:target DNA recognition by CRISPR-associated endonuclease Cas9. a Overall structure of the Cas9-sgRNA-DNA ternary complex (PDB accession code: 4UN3). b Structure of the sgRNA:target DNA complex. The sgRNA, target DNA strand, and non-target DNA strand are colored red, blue, and black, respectively. c Close-up view of the interaction between the sgRNA guide region and the conserved arginine cluster of the bridge helix; U16-R447 and G18-R71 interactions define the specificity of RNA recognition by Cas9. d, e Sequence-dependent interactions between Cas9 and the repeat:anti-repeat duplex. The RNA bases of U23/A49 and A42/G43 form hydrogen bonds with the side chain of R1122 and the main-chain carbonyl group of F351, respectively; the base of the flipped U44 is trapped between Y325 and H328 through stacking interactions, and D364 interacts with the nucleobase of the unpaired G43 by forming hydrogen bonds. f, g Close-up views of interactions between Cas9 and stem loops 1 and 2 of the sgRNA, respectively. The BH domain is colored in green, the RuvC domain in magenta, and the C-terminal domain in yellow. RNAs and the residues that are responsible for recognition are shown in stick representation; hydrogen bonds are indicated as black dashed lines
