PLoS Genet. 2012;8(8):e1002910.
doi: 10.1371/journal.pgen.1002910. Epub 2012 Aug 16.

A combinatorial amino acid code for RNA recognition by pentatricopeptide repeat proteins

Alice Barkan et al. PLoS Genet. 2012.

Abstract

The pentatricopeptide repeat (PPR) is a helical repeat motif found in an exceptionally large family of RNA-binding proteins that functions in mitochondrial and chloroplast gene expression. PPR proteins harbor between 2 and 30 repeats and typically bind single-stranded RNA in a sequence-specific fashion. However, the basis for sequence-specific RNA recognition by PPR tracts has been unknown. We used computational methods to infer a code for nucleotide recognition involving two amino acids in each repeat, and we validated this model by recoding a PPR protein to bind novel RNA sequences in vitro. Our results show that PPR tracts bind RNA via a modular recognition mechanism that differs from previously described RNA-protein recognition modes and that underpins a natural library of specific protein/RNA partners of unprecedented size and diversity. These findings provide a significant step toward the prediction of native binding sites of the enormous number of PPR proteins found in nature. Furthermore, the extraordinary evolutionary plasticity of the PPR family suggests that the PPR scaffold will be particularly amenable to redesign for new sequence specificities and functions.
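
The modular mechanism described above, in which two amino acids in each repeat specify one nucleotide, can be illustrated with a small lookup sketch. The five 6/1′ combinations used as keys are the ones named in the Figure 4 caption. The nucleotide assigned to each combination is an assumption included for illustration only, since the abstract does not list the assignments. The function names and the example repeat list are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Key: (amino acid at position 6, amino acid at position 1' of the next repeat).
# ASSUMPTION: these nucleotide assignments are illustrative; the abstract
# names the combinations but does not state which nucleotide each specifies.
ASSUMED_CODE: Dict[Tuple[str, str], str] = {
    ("T", "N"): "A",
    ("T", "D"): "G",
    ("N", "D"): "U",
    ("N", "N"): "C",
    ("N", "S"): "C",
}


def predict_site(pairs: Sequence[Tuple[str, str]],
                 code: Dict[Tuple[str, str], str] = ASSUMED_CODE,
                 unknown: str = "?") -> str:
    """Return one predicted nucleotide per repeat.

    Repeats are given N to C terminus, so the output reads 5' to 3'.
    Combinations missing from the table yield the `unknown` placeholder.
    """
    return "".join(code.get(pair, unknown) for pair in pairs)


def count_matches(pairs: Sequence[Tuple[str, str]], rna: str,
                  code: Dict[Tuple[str, str], str] = ASSUMED_CODE) -> int:
    """Count aligned positions where the table's nucleotide equals the RNA's."""
    if len(pairs) != len(rna):
        raise ValueError("pairs and rna must have the same length")
    return sum(1 for pair, nt in zip(pairs, rna) if code.get(pair) == nt)


# Hypothetical four-repeat tract; ("V", "S") is deliberately absent from the table.
example_pairs = [("T", "D"), ("N", "D"), ("V", "S"), ("T", "N")]
predicted = predict_site(example_pairs)
matches = count_matches(example_pairs, "GUCA")
```

Passing the table as a parameter keeps the sketch usable with a different or extended set of assignments, and the placeholder makes repeats that fall outside the table visible rather than silently guessed.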

Conflict of interest statement

The authors have submitted a provisional patent application that is based on this work. In addition, the authors have grant funding that supports this research.

Figures

Figure 1. Sedimentation Velocity Analytical Ultracentrifugation of rPPR10 and rPPR10/RNA Complexes.
(A) SV-AUC analysis of rPPR10 at 3, 6, and 12 µM. (B) SV-AUC analysis of rPPR10 (3 µM) in the presence of its 17-nt minimal RNA ligand (1.5 µM or 3 µM). The assignment of the two species at ∼5S in the top panel as either PPR10 monomer or PPR10/RNA is ambiguous, as variation in apparent S value can result when multiple species of similar abundance are in equilibrium. The root-mean-square deviations ranged between 0.007 and 0.013. The trace species at low S values may result from contaminating MBP and TEV protease, whereas those of larger size may represent higher order PPR10 oligomers.
Figure 2. Alignments between PPR Proteins and Cognate Binding Sites.
(A) Statistically optimal alignments between amino acids at positions 6 (blue) and 1′ (red) in PPR10's PPR motifs and its RNA ligands (italics). PPR10's in vivo footprints are shown at top; the box marks the minimal binding site defined in vitro. Dark green shading indicates experimentally validated matches (Figure 5). Light green shading indicates significant correlation between position 6 and the purine/pyrimidine class of the matched nucleotide (Table S3). Magenta shading indicates significant anti-correlation between position 6 and the purine/pyrimidine class of the matched nucleotide (Table S3). Compensatory changes in orthologous protein/RNA pairs are indicated with a star. The PPR motifs are ordered from N to C terminus in the protein, and nucleotides are ordered from 5′ to 3′ in the RNA. The same schemes apply to panels (C) and (D). (B) Structural model illustrating physical plausibility of the cooperation between amino acids at positions 6 and 1′ in nucleotide specification. The model of the PPR10-atpH RNA complex was produced using distance geometry methods as previously described. RNA bases were constrained to be within 3 Å of residues 6 and 1′ of helices A and A′ of adjacent motifs. Each PPR motif consists of one “A” and one “B” helix, as marked. (C) Alignments between amino acids at positions 6 and 1′ in PPR motifs of HCF152 and CRP1 and their RNA ligands. The psbH-petB sequence is HCF152's in vivo footprint, within which HCF152 binds in vitro. The petB-petD sequence is a CRP1-dependent in vivo footprint. The psaC sequence maps within the 70-nt region that most strongly coimmunoprecipitates with CRP1. (D) Alignments between amino acids at positions 6 and 1′ in PPR motifs of the RNA editing factors OTP82, CRR22 and CRR4 and their RNA targets. Minimal binding sites determined in vitro are boxed. The edited C (magenta) is the last nucleotide in each case. The type of PPR motif, either P, L or S, is indicated above. Only matches involving P or S motifs are shaded, as L motifs cannot be accommodated within the code developed here.
Figure 3. Amino Acid Representation at Each Position of PPR Motifs that Align with A, G, C, or U Bases.
Motif pairs from PPR10, HCF152, CRP1 and 37 RNA editing factors flanking the indicated nucleotide were used to construct sequence logos. Each logo shows the first fifteen positions of the P-type motif containing position 6, a gap, and then the first five positions of the following motif. The A, G, C and U logos were generated from 74, 48, 96 and 126 motif pairs, respectively. The alignments used to generate the logos are shown in Figure S1.
Figure 4. Nucleotides That Align with the Most Frequent Combinations of Amino Acids at Positions 6 and 1′.
Nucleotides aligned with each 6/1′ combination in the alignments in Figure S1 were used to construct sequence logos. Only P motifs were used in this analysis. Each logo shows the aligned nucleotide (0) and the preceding (−1) and succeeding (+1) nucleotides. The T6N1′, T6D1′, N6D1′, N6N1′ and N6S1′ logos were generated from 25, 23, 102, 86 and 16 alignments, respectively.
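
The tallying step behind logos of this kind can be sketched as a count of which nucleotides occur at offsets −1, 0 and +1 for each 6/1′ combination. This is a minimal illustration, not the authors' pipeline: the input format, the function name and the toy records are all hypothetical, and the sketch stops at raw counts rather than drawing logos.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# One record: (position-6 residue, position-1' residue, RNA string,
# index of the nucleotide aligned with that motif pair). Hypothetical format.
Record = Tuple[str, str, str, int]


def tally_aligned_nucleotides(
    records: Iterable[Record],
) -> Dict[Tuple[str, str], Dict[int, Counter]]:
    """Count nucleotides at offsets -1, 0 and +1 for each 6/1' combination."""
    tallies: Dict[Tuple[str, str], Dict[int, Counter]] = defaultdict(
        lambda: {-1: Counter(), 0: Counter(), 1: Counter()}
    )
    for pos6, pos1p, rna, index in records:
        for offset in (-1, 0, 1):
            j = index + offset
            # Skip offsets that fall off either end of the RNA.
            if 0 <= j < len(rna):
                tallies[(pos6, pos1p)][offset][rna[j]] += 1
    return tallies


# Toy records invented for illustration; they are not data from the paper.
toy_records = [
    ("T", "D", "AGU", 1),
    ("T", "D", "CGA", 1),
    ("N", "D", "UUC", 0),
]
tallies = tally_aligned_nucleotides(toy_records)
```

Dividing each `Counter` by its total would give the per-position frequencies that a logo displays as letter heights.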
Figure 5. Gel Mobility Shift Assays Validating Amino Acid Codes for Specifying PPR Binding to A, G, C, or U.
(A) Summary of rPPR10 variants. The same amino acids at positions 6 and 1′ were introduced into the sixth and seventh PPR motifs in PPR10, whose wild-type sequences are shown above. The RNAs used for binding assays are shown below. (B) Gel mobility shift assays with the wild-type RNA, or variants with nucleotides four and five substituted with either GG, AA, UU, or CC. (C) Binding curves of the NN, ND, and NS PPR10 variants with the UU and CC substituted RNAs.
