J Mol Biol. 2016 Feb 27;428(5 Pt A):748-757.
doi: 10.1016/j.jmb.2015.11.013. Epub 2016 Feb 17.

Principles for Predicting RNA Secondary Structure Design Difficulty

Jeff Anderson-Lee et al. J Mol Biol.

Abstract

Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAinverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess "designability" of single RNA structures, as well as of switches for in vitro and in vivo applications.
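
Two of the features named above, sequence length and mean stem length, can be read directly off a target structure written in dot-bracket notation. The sketch below is illustrative only and is not the authors' code: it assumes that a "stem" means a maximal run of directly stacked base pairs (any bulge or internal loop ends the stem), which may differ from the definition used in the paper. The function names are hypothetical, and pseudoknots are not handled.

```python
def pair_table(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def stem_lengths(structure):
    """Lengths of maximal runs of stacked pairs (assumed stem definition)."""
    pairs = pair_table(structure)
    lengths = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue  # visit each pair once, from its 5' side
        if pairs.get(i - 1) == j + 1:
            continue  # pair (i, j) is stacked on an outer pair, not a stem start
        n = 1
        # Extend inward while the next pair is directly stacked on this one.
        while i + n < j - n and pairs.get(i + n) == j - n:
            n += 1
        lengths.append(n)
    return lengths


def structure_features(structure):
    """Sequence length, stem count and mean stem length of a target structure."""
    stems = stem_lengths(structure)
    mean = sum(stems) / len(stems) if stems else 0.0
    return {"length": len(structure), "n_stems": len(stems), "mean_stem_length": mean}


if __name__ == "__main__":
    print(structure_features("((..((...))..))"))
```

In this sketch the 15-nucleotide structure `((..((...))..))` has two stems of two pairs each, so its mean stem length is 2.0. Symmetry and zigzag motifs are left out because the abstract does not define them precisely enough to compute.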

Keywords: RNA design; RNA secondary structure; benchmark; citizen science; inverse folding.

Figures

Figure 1. Tools used by Eterna community to assess difficulty of RNA secondary structure design puzzles
[A] The puzzlemaker interface allows players to define a secondary structure string or insert bases or base pairs at specified positions and then deploy these puzzles to other players to solve. [B] The puzzle solver interface enables players to select nucleotides and paint them over the structure, which can switch between natural and target modes. [C] After puzzle design, players are able to see which bots are able to solve their puzzles. [D] The Eternascript interface allows players to create and test their own puzzle-solving algorithms.
Figure 2. RNA design puzzles from Eterna demonstrate features that make design difficult
Open or filled squares indicate failed or successful solutions, respectively, by existing RNA design algorithms – RNAinverse (red), INFO-RNA (yellow), RNA-SSD (green), NUPACK (cyan), DSS-Opt (blue), and MODENA (purple). [A] Stem length: Shortie 4, Shortie 6; [B] Adjacent Multiloops: Kyurem 5, Kyurem 7; [C] Loop next to a Multiloop; [D] Bulges: Just down to 1 bulge, 1,2,3 and 4 bulges; [E] Internal Loops: Mat – Lot 2-2 B, Crop Circle 2; [F] Zigzags: Hard Y; [G] Simple puzzles: This is ACTUALLY Small and Easy 6; [H] Quasispecies 2-2 Loop Challenge, Water Strider, The Fractal, Mutated Chicken Feet
Figure 3. A case study for RNA design – Kyurem 7
[A] A near-miss sequence design for Kyurem 7 (see Figure 2B) was designed by a player-created bot and misfolds only in one stem (red). [B] A successful solution with only a few base changes (green). [C] A slight variation in the Kyurem 7 target secondary structure has only minor changes in the lengths of the multiloops, but is much easier to solve due to the availability of low-energy designs for the expanded multiloops. Nucleotides are colored by base, with A in yellow, U in blue, G in red and C in green. Minimum free energy structures and loop energies (in kcal/mol) are based on the Turner 1999 parameters.
Figure 4. An example of a common near-miss fold
A symmetric, multiloop-containing puzzle created by players, in which designs for the target structure [A] frequently mispair to create the same misfolded structure [B].
Figure 5. Performance of existing algorithms on the Eterna100 benchmark
Six RNA design algorithms were evaluated using the Eterna100 benchmark. [A] The successes (green) and failures (red) are shown for each algorithm. The puzzles are ordered by the number of successful solvers on Eterna, from fewest to most, and the algorithms are labeled with the number of puzzles solved. [B] The amount of time required to reach a solution for puzzles of different lengths is shown for each algorithm. Lines show the median values over each bin of lengths.
