PLoS Comput Biol. 2019 Jun 27;15(6):e1007059.

doi: 10.1371/journal.pcbi.1007059. eCollection 2019 Jun.

EternaBrain: Automated RNA design through move sets and strategies from an Internet-scale RNA videogame

Rohan V Koodli¹, Benjamin Keep², Katherine R Coppess³, Fernando Portela¹; Eterna participants; Rhiju Das¹,³

Affiliations

¹ Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, United States of America.
² Department of Education, Stanford University, Stanford, CA, United States of America.
³ Department of Physics, Stanford University, Stanford, CA, United States of America.

PMID: 31247029
PMCID: PMC6597038
DOI: 10.1371/journal.pcbi.1007059

Rohan V Koodli et al. PLoS Comput Biol. 2019.

Abstract

Emerging RNA-based approaches to disease detection and gene therapy require RNA sequences that fold into specific base-pairing patterns, but computational algorithms generally remain inadequate for these secondary structure design tasks. The Eterna project has crowdsourced RNA design to human video game players in the form of puzzles that reach extraordinary difficulty. Here, we demonstrate that Eterna participants' moves and strategies can be leveraged to improve automated computational RNA design. We present an eternamoves-large repository consisting of 1.8 million player moves on 12 of the most-played Eterna puzzles, as well as an eternamoves-select repository of 30,477 moves from the top 72 players on a select set of more advanced puzzles. On eternamoves-select, we present a multilayer convolutional neural network (CNN), EternaBrain, that achieves test accuracies of 51% and 34% in base prediction and location prediction, respectively, suggesting that top players' moves are partially stereotyped. Pipelining this CNN's move predictions with single-action-playout (SAP) of six strategies compiled by human players solves 61 out of 100 independent puzzles in the Eterna100 benchmark. EternaBrain-SAP outperforms previously published RNA design algorithms and achieves similar or better performance than a newer generation of deep learning methods, while being largely orthogonal to these other methods. Our study provides useful lessons for future efforts to achieve human-competitive performance with automated RNA design algorithms.
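To make the two reported accuracies concrete, the sketch below treats a move as a (location, base) pair and scores the two components separately. It is an illustration only: the function names `extract_move` and `accuracies` are hypothetical, and the assumption that every move changes exactly one nucleotide is ours, not something the abstract states.

```python
def extract_move(before, after):
    """Return (location, new_base) for two sequences that differ at one position.

    Assumption: a move mutates exactly one nucleotide.
    """
    if len(before) != len(after):
        raise ValueError("sequences must have the same length")
    diffs = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected exactly one changed base")
    i = diffs[0]
    return i, after[i]


def accuracies(predicted, actual):
    """Return (base_accuracy, location_accuracy) over paired lists of moves."""
    n = len(actual)
    location_hits = sum(p[0] == a[0] for p, a in zip(predicted, actual))
    base_hits = sum(p[1] == a[1] for p, a in zip(predicted, actual))
    return base_hits / n, location_hits / n
```

Scoring base and location independently mirrors the way the abstract reports them as two separate numbers; a joint "both correct" accuracy would be a different, stricter metric.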

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Eterna and EternaBrain.**
(A-C) Puzzle-solving interface presented to human players of Eterna including the state of the puzzle (whether it is solved or not) in the top left corner (red/green outline), the puzzle itself (in the middle), and the toolbar (bottom) with which the players can mutate the RNA sequence to make it fold into the desired state; yellow, blue, red, and green symbols represent A, U, G, and C nucleotides. (A) The desired target structure for the RNA molecule, as indicated by the bullseye in the bottom left (orange highlight). (B) Nature mode, as indicated by the leaf in the bottom left (orange highlight), gives the predicted minimum free energy structure for the current sequence. Since the bases in the top right should be paired with each other (orange circle), this puzzle is not yet folding correctly; this status is shown by the red indicator in the top left corner. (C) The solved puzzle. The nature-mode structure matches the target structure, and the indicator in the top left corner turns green, meaning the puzzle has been solved. (D) (left) Wide distribution of contributed Eterna solutions across different players. For preparing the *eternamoves-select* data set, we selected any player who had solved more than 3000 distinct puzzles, which left us with 72 players. (right) In EternaBrain, we tested whether information on players’ moves could be used to train a convolutional neural network. (E) For solving new puzzles, the final EternaBrain-SAP framework first uses the EternaBrain convolutional neural net model to predict sequence changes (‘moves’) for new RNA puzzles. In a second stage, the Single Action Playout (SAP), six additional hand-coded strategies are applied to complete the solution.
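The player filter described in panel (D) can be sketched as a count of distinct solved puzzles per player. This is a minimal illustration, not the authors' code: the `(player, puzzle_id)` record layout and the name `select_players` are assumptions, and only the threshold of more than 3000 distinct puzzles comes from the caption.

```python
from collections import defaultdict


def select_players(solutions, min_puzzles=3000):
    """Return players who solved more than `min_puzzles` distinct puzzles.

    `solutions` is an iterable of (player, puzzle_id) records (assumed layout).
    """
    solved = defaultdict(set)
    for player, puzzle_id in solutions:
        solved[player].add(puzzle_id)  # a set ignores repeat solutions of one puzzle
    return sorted(p for p, puzzles in solved.items() if len(puzzles) > min_puzzles)
```

Using a set per player matters because the criterion is distinct puzzles: a player who submits many solutions to the same puzzle is counted once for it.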

**Fig 2. The 6 strategies included in the SAP.**
(A) The original state of the puzzle before SAP. This represents a puzzle initiated with an arbitrary sequence of nucleotides; panel displays the target structure, where mismatched nucleotides (C-A) are highlighted. (B) The first step of the SAP is to correct mismatched pairs. Here, the cytosine nucleotides are switched to uracil to pair with adenine. (C) Changing end pairs to G-C. Changing base pairs that are at the edges of stems and flank loops to G-C pairs lowers the free energy of the molecule. (D) G-internal loop boost. The first nucleotide in an internal loop on either side is switched to a guanine. (E) U-G-U-G super boost. In an internal loop with 2 unpaired bases on either side, the 2 bases are changed to uracil and guanine, in that order, on either side. (F) G-hairpin boost. The first nucleotide in each strand of a hairpin loop is changed to a guanine. (G) Reorienting base pairs. Target base pairs that are not predicted to be folded correctly are ‘flipped’ to lower the energy of the structure. Here, alternating the A-U pairs lowers the energy of the stack. The 5’ end of each puzzle is at the top left, with the puzzle drawn counter-clockwise from that point.
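Two of these strategies, (B) correcting mismatched pairs and (C) changing end pairs to G-C, can be sketched on a dot-bracket target structure. This is a rough illustration under stated assumptions, not the published implementation: the function names, the `KEEP_ORDER` rule for deciding which base of a mismatched pair to keep, the G-on-the-5′-side orientation, and the definition of an end pair as any pair not stacked on both sides (which also catches the outermost pair) are all ours.

```python
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
KEEP_ORDER = "AGUC"  # assumption: which base of a mismatched pair is kept


def pair_table(structure):
    """Map every paired index to its partner for a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs


def correct_mismatched_pairs(seq, structure):
    """Strategy (B): rewrite one base of each target pair that cannot pair."""
    bases = list(seq)
    for i, j in pair_table(structure).items():
        if i < j and (bases[i], bases[j]) not in VALID_PAIRS:
            # Keep the base that comes first in KEEP_ORDER; ties keep the 5' base.
            if KEEP_ORDER.index(bases[j]) < KEEP_ORDER.index(bases[i]):
                bases[i] = COMPLEMENT[bases[j]]
            else:
                bases[j] = COMPLEMENT[bases[i]]
    return "".join(bases)


def gc_end_pairs(seq, structure):
    """Strategy (C): set pairs at stem edges to G-C (G on the 5' side, assumed)."""
    bases = list(seq)
    pairs = pair_table(structure)
    for i, j in pairs.items():
        if i > j:
            continue
        stacked_inside = i + 1 < j and pairs.get(i + 1) == j - 1
        stacked_outside = pairs.get(i - 1) == j + 1
        if not (stacked_inside and stacked_outside):
            bases[i], bases[j] = "G", "C"
    return "".join(bases)


target = "((((....))))"
fixed = correct_mismatched_pairs("CAAAGAAAUUUA", target)  # C-A becomes U-A
boosted = gc_end_pairs(fixed, target)                     # both stem ends become G-C
```

With `KEEP_ORDER` starting with A, a C-A mismatch is resolved by switching the cytosine to uracil, which matches the example in panel (B); the behaviour for other mismatches is a guess.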

**Fig 3. EternaBrain performance.**
(A) Performance of EternaBrain and 6 previously published algorithms on Eterna100 benchmark. EternaBrain solves 61/100, followed by MODENA (54/100), INFO-RNA (50/100), NUPACK (48/100), DSS-Opt (47/100), RNAinverse (28/100), and RNA-SSD (27/100). (B) Performance of Alternative Model Constructions. The CNN alone could solve only 20/100, and the SAP alone could solve 50/100. Removing various input features passed into the CNN resulted in drops in performance, confirming the importance of these features.

**Fig 4. Example EternaBrain-SAP solutions to Eterna100 puzzles.**
(A) U solution highlights the fact that the EternaBrain CNN alone can solve puzzles with short stems. (B) *Chicken Tracks* solution: EternaBrain-SAP can solve puzzles with three stems intersecting in one internal loop. (C) *Thunderbolt* solution demonstrates that EternaBrain-SAP can solve large puzzles (400 nucleotides long) and solve loops and stems in combination. (D) *Shortie 4* solution shows EternaBrain-SAP can solve puzzles with multiple short stems (2 nucleotides long). (E) *Shortie 6* is quite similar to *Shortie 4*, but with the same motif (short stems) repeated. The other algorithms mentioned could not solve *Shortie 6* because of the repeated motifs. (F) *Hard Y*—target structure (left) vs nature-mode (right) structure. EternaBrain-SAP could not solve *Hard Y* because it required use of a little-used strategy to solve a motif called a zigzag. Since the strategy is not often used by players, the EternaBrain CNN did not learn the strategy and the strategy was not included in the SAP. In each panel, the 5’ end of each puzzle is at the top left, with the puzzle drawn counter-clockwise from that point.

Cited by

Machine Learning for RNA Design: LEARNA.
Runge F, Hutter F. Methods Mol Biol. 2025;2847:63-93. doi: 10.1007/978-1-0716-4079-1_5. PMID: 39312137
RNA Engineering for Public Health: Innovations in RNA-Based Diagnostics and Therapeutics.
Thavarajah W, Hertz LM, Bushhouse DZ, Archuleta CM, Lucks JB. Annu Rev Chem Biomol Eng. 2021 Jun 7;12:263-286. doi: 10.1146/annurev-chembioeng-101420-014055. Epub 2021 Apr 26. PMID: 33900805
Authorship and Citizen Science: Seven Heuristic Rules.
Sandin P, Baard P, Bülow W, Helgesson G. Sci Eng Ethics. 2024 Oct 29;30(6):53. doi: 10.1007/s11948-024-00516-x. PMID: 39470965
Fast free-energy-based neutral set size estimates for the RNA genotype-phenotype map.
Martin NS, Ahnert SE. J R Soc Interface. 2022 Jun;19(191):20220072. doi: 10.1098/rsif.2022.0072. Epub 2022 Jun 15. PMID: 35702868
RNA secondary structure prediction with convolutional neural networks.
Saman Booy M, Ilin A, Orponen P. BMC Bioinformatics. 2022 Feb 2;23(1):58. doi: 10.1186/s12859-021-04540-7. PMID: 35109787
