A combinatorial amino acid code for RNA recognition by pentatricopeptide repeat proteins

Alice Barkan¹, Margarita Rojas, Sota Fujii, Aaron Yap, Yee Seng Chong, Charles S Bond, Ian Small

PMID: 22916040
PMCID: PMC3420917
DOI: 10.1371/journal.pgen.1002910

Alice Barkan et al. PLoS Genet. 2012.

PLoS Genet. 2012;8(8):e1002910.

doi: 10.1371/journal.pgen.1002910. Epub 2012 Aug 16.

Affiliation

¹ Institute of Molecular Biology, University of Oregon, Eugene, Oregon, United States of America. abarkan@uoregon.edu

Abstract

The pentatricopeptide repeat (PPR) is a helical repeat motif found in an exceptionally large family of RNA-binding proteins that functions in mitochondrial and chloroplast gene expression. PPR proteins harbor between 2 and 30 repeats and typically bind single-stranded RNA in a sequence-specific fashion. However, the basis for sequence-specific RNA recognition by PPR tracts has been unknown. We used computational methods to infer a code for nucleotide recognition involving two amino acids in each repeat, and we validated this model by recoding a PPR protein to bind novel RNA sequences in vitro. Our results show that PPR tracts bind RNA via a modular recognition mechanism that differs from previously described RNA-protein recognition modes and that underpins a natural library of specific protein/RNA partners of unprecedented size and diversity. These findings provide a significant step toward the prediction of native binding sites of the enormous number of PPR proteins found in nature. Furthermore, the extraordinary evolutionary plasticity of the PPR family suggests that the PPR scaffold will be particularly amenable to redesign for new sequence specificities and functions.

Conflict of interest statement

The authors have submitted a provisional patent application that is based on this work. In addition, the authors have grant funding that supports this research.

Figures

**Figure 1. Sedimentation Velocity Analytical Ultracentrifugation of rPPR10 and rPPR10/RNA Complexes.**
(A) SV-AUC analysis of rPPR10 at 3, 6, and 12 µM. (B) SV-AUC analysis of rPPR10 (3 µM) in the presence of its 17-nt minimal RNA ligand (1.5 µM or 3 µM). The assignment of the two species at ∼5S in the top panel as either PPR10 monomer or PPR10/RNA is ambiguous, as variation in apparent S value can result when multiple species of similar abundance are in equilibrium. The root-mean-square deviations ranged between 0.007 and 0.013. The trace species at low S values may result from contaminating MBP and TEV protease, whereas those of larger size may represent higher-order PPR10 oligomers.

**Figure 2. Alignments between PPR Proteins and Cognate Binding Sites.**
(A) Statistically optimal alignments between amino acids at positions 6 (blue) and 1′ (red) in PPR10's PPR motifs and its RNA ligands (italics). PPR10's *in vivo* footprints are shown at top; the box marks the minimal binding site defined *in vitro*. Dark green shading indicates experimentally validated matches (Figure 5). Light green shading indicates significant correlation between position 6 and the purine/pyrimidine class of the matched nucleotide (Table S3). Magenta shading indicates significant anti-correlation between position 6 and the purine/pyrimidine class of the matched nucleotide (Table S3). Compensatory changes in orthologous protein/RNA pairs are indicated with a star. The PPR motifs are ordered from N to C terminus in the protein, and nucleotides are ordered from 5′ to 3′ in the RNA. The same schemes apply to panels (C) and (D). (B) Structural model illustrating physical plausibility of the cooperation between amino acids at positions 6 and 1′ in nucleotide specification. The model of the PPR10-*atpH* RNA complex was produced using distance geometry methods as previously described. RNA bases were constrained to be within 3 Å of residues 6 and 1′ of helices A and A′ of adjacent motifs. Each PPR motif consists of one “A” and one “B” helix, as marked. (C) Alignments between amino acids at positions 6 and 1′ in PPR motifs of HCF152 and CRP1 and their RNA ligands. The *psbH-petB* sequence is HCF152's *in vivo* footprint, within which HCF152 binds *in vitro*. The *petB-petD* sequence is a CRP1-dependent *in vivo* footprint. The *psaC* sequence maps within the 70-nt region that most strongly coimmunoprecipitates with CRP1. (D) Alignments between amino acids at positions 6 and 1′ in PPR motifs of the RNA editing factors OTP82, CRR22 and CRR4 and their RNA targets. Minimal binding sites determined *in vitro* are boxed. The edited C (magenta) is the last nucleotide in each case. The type of PPR motif, either P, L or S, is indicated above. Only matches involving P or S motifs are shaded, as L motifs cannot be accommodated within the code developed here.

**Figure 3. Amino Acid Representation at Each Position of PPR Motifs that Align with A, G, C, or U Bases.**
Motif pairs from PPR10, HCF152, CRP1 and 37 RNA editing factors flanking the indicated nucleotide were used to construct sequence logos. Each logo shows the first fifteen positions of the P-type motif containing position 6, a gap, and then the first five positions of the following motif. 74, 48, 96 and 126 motif pairs were used to generate the A, G, C and U logos, respectively. The alignments used to generate the logos are shown in Figure S1.
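
For readers unfamiliar with how sequence logos summarize an alignment, the following is a minimal sketch of the per-column information content that logo heights usually represent. It is not the authors' pipeline: the function names and the three toy motif strings are hypothetical, and the small-sample correction that logo software commonly applies is omitted.

```python
import math
from collections import Counter


def column_information(residues, alphabet_size=20):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies. No small-sample correction is applied.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_motifs, alphabet_size=20):
    """Return the information content of every column in an alignment."""
    columns = zip(*aligned_motifs)  # transpose rows (motifs) into columns
    return [column_information(col, alphabet_size) for col in columns]


# Hypothetical toy alignment of three equal-length motif fragments,
# invented purely to exercise the functions above.
toy_alignment = ["VTYNT", "VSYNA", "VTFNS"]
heights = logo_heights(toy_alignment)
```

A fully conserved column (such as the first and fourth columns of the toy alignment) reaches the maximum of log2(20) bits for a 20-letter amino acid alphabet, while a column with uniformly distributed residues contributes nothing.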

**Figure 4. Nucleotides That Align with the Most Frequent Combinations of Amino Acids at Positions 6 and 1′.**
Nucleotides aligned with each 6/1′ combination in the alignments in Figure S1 were used to construct sequence logos. Only P motifs were used in this analysis. Each logo shows the aligned nucleotide (0) and the preceding (−1) and succeeding (+1) nucleotides. 25, 23, 102, 86 and 16 alignments were used to generate the T₆N₁′, T₆D₁′, N₆D₁′, N₆N₁′ and N₆S₁′ logos, respectively.
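
The modular, two-residue code described in the abstract can be pictured as a lookup from the amino acids at positions 6 and 1′ of consecutive motifs to a nucleotide. The sketch below illustrates that idea only. The five residue combinations are the ones named in this caption, but the nucleotide assigned to each is an assumption that does not appear on this page and should be checked against the full text; `PPR_CODE` and `predict_site` are hypothetical names.

```python
# Hypothetical lookup: (residue at position 6, residue at position 1') -> nucleotide.
# The five combinations come from the Figure 4 caption. The nucleotide
# assignments are ASSUMED here and must be verified against the full paper.
PPR_CODE = {
    ("T", "N"): "A",
    ("T", "D"): "G",
    ("N", "D"): "U",
    ("N", "N"): "C",  # assumed to favor C; U may also be tolerated
    ("N", "S"): "C",
}


def predict_site(motif_pairs):
    """Predict an RNA string from 6/1' residue pairs, ordered N to C terminus.

    Pairs that are not in the lookup (for example, those from L motifs)
    are reported as '?' rather than guessed.
    """
    return "".join(PPR_CODE.get(pair, "?") for pair in motif_pairs)


# Example with an invented series of motif pairs (not a real protein).
example = [("T", "N"), ("T", "D"), ("N", "D"), ("N", "S"), ("L", "V")]
predicted = predict_site(example)
```

Because motifs are ordered from N to C terminus and nucleotides from 5′ to 3′, the predicted string reads in the same direction as the alignments in Figure 2.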

**Figure 5. Gel Mobility Shift Assays Validating Amino Acid Codes for Specifying PPR Binding to A, G, C, or U.**
(A) Summary of rPPR10 variants. The same amino acids at positions 6 and 1′ were introduced into the sixth and seventh PPR motifs in PPR10, whose wild-type sequences are shown above. The RNAs used for binding assays are shown below. (B) Gel mobility shift assays with the wild-type RNA, or variants with nucleotides four and five substituted with either GG, AA, UU, or CC. (C) Binding curves of the NN, ND, and NS PPR10 variants with the UU and CC substituted RNAs.
