. 2015 Sep 3:16:280.

doi: 10.1186/s12859-015-0706-x.

Multi-objective optimization for RNA design with multiple target secondary structures

Akito Taneda¹

Affiliations

PMID: 26335276
PMCID: PMC4559319
DOI: 10.1186/s12859-015-0706-x

Multi-objective optimization for RNA design with multiple target secondary structures

Akito Taneda. BMC Bioinformatics. 2015.

. 2015 Sep 3:16:280.

doi: 10.1186/s12859-015-0706-x.

Author

Akito Taneda¹

Affiliation

¹ Graduate School of Science and Technology, Hirosaki University, 3 Bunkyo-cho, Hirosaki, Aomori, Japan. taneda@eit.hirosaki-u.ac.jp.

PMID: 26335276
PMCID: PMC4559319
DOI: 10.1186/s12859-015-0706-x

Abstract

Background: RNAs are attractive molecules as the biological parts for synthetic biology. In particular, the ability of conformational changes, which can be encoded in designer RNAs, enables us to create multistable molecular switches that function in biological circuits. Although various algorithms for designing such RNA switches have been proposed, the previous algorithms optimize the RNA sequences against the weighted sum of objective functions, where empirical weights among objective functions are used. In addition, an RNA design algorithm for multiple pseudoknot targets is currently not available.

Results: We developed a novel computational tool for automatically designing RNA sequences which fold into multiple target secondary structures. Our algorithm designs RNA sequences based on multi-objective genetic algorithm, by which we can explore the RNA sequences having good objective function values without empirical weight parameters among the objective functions. Our algorithm has great flexibility by virtue of this weight-free nature. We benchmarked our multi-target RNA design algorithm with the datasets of two, three, and four target structures and found that our algorithm shows better or comparable design performances compared with the previous algorithms, RNAdesign and Frnakenstein. In addition to the benchmarks with pseudoknot-free datasets, we benchmarked MODENA with two-target pseudoknot datasets and found that MODENA can design the RNAs which have the target pseudoknotted secondary structures whose free energies are close to the lowest free energy. Moreover, we applied our algorithm to a ribozyme-based ON-switch which takes a ribozyme-inactive secondary structure when the theophylline aptamer structure is assumed.

Conclusions: Currently, MODENA is the only RNA design software which can be applied to multiple pseudoknot targets. Successful design results for the multiple targets and an RNA device indicate usefulness of our multi-objective RNA design algorithm.

PubMed Disclaimer

Figures

**Fig. 1**
Pseudocode for the genetic algorithm utilized in MODENA for multiple targets l is a loop counter; c is a counter for a stop condition; P is a set of individuals; n _rank1, m _rank1 and N _del are temporary variables

**Fig. 2**
An example of a connected component of the dependency graph. a An example of the set of three targets T1, T2, and T3. Nucleotide positions are numbered above the example, e.g, shadowed columns from the left to the right correspond to nucleotide positions 1, 9, 13, 21, 25, 33, 37, 42, respectively. As an example, base paired positions belonging to a connected component are shadowed. The dependency graph is composed of all connected components (not shown except the shadowed one) derived from the targets. b A graph representation of the connected component shadowed in (a). The numbers indicate nucleotide positions. An edge corresponds to a base pair between nucleotide positions. If there is a base pair in one of the target structures, an edge appears in a connected component. E.g., nucleotide positions 21 and 25 in a have a base pair in T3, so that there is an edge between node 21 and 25 in (b). c Decomposition of the connected component into the paths, where a cycle is treated as a path by defining a single vertex as a ‘start and end’ vertex (e.g. the cycle shown at the left of this figure). The start and end vertices are denoted by solid circles

**Fig. 3**
An example of the procedure for assigning nucleotide codes to the ‘start and end’ vertices of the connected component shown in Fig. 2. In this example, position 21 is selected as a root vertex v _arb. Then we consider the tree structure for assigning nucleotides to the ‘start and end’ vertices, where the order of the ‘start and end’ vertices is obtained by the depth-first search in the spanning tree of the connected component (in this example, the order of positions 21, 25, 37, and 42 from the top to the bottom of the tree). Dashed arrows mean that there exists a decomposed path between two positions. In this example, first, an A or G is randomly selected for position 21 (it is noted that, in the case of the nucleotide assignment for the GA initialization, not an A or G, but an A, C, G or U is randomly selected here). Then nucleotide codes of the remaining positions are assigned from the top of the tree to the bottom. Let us consider an A is selected for position 21. Even in the case such that a C alone is allowed to position 37 due to a sequence constraint, first we try to assign AUU from the top to the bottom; however, since this violates the constraint, we backtrack to position 25 and then assign a C to position 37. As a result, we assign AUCA to the ‘start and end’ vertices in this example

**Fig. 4**
Pseudocode of the nucleotide assignment algorithm for a given connected component c. By using this function, we can assign compatible nucleotide codes to the nucleotide positions belonging to c; the output is contained in a one-dimensional array sq. *maxcount* and b are a counter variable and a temporary one dimensional array for traversing the ‘start and end’ vertices, respectively. In line 6, this function calls traverseStEndVertices() shown in Fig. 5. In line 11, k = 1 and k = length(p) correspond to the start and end vertices, respectively; length(p) is the number of nucleotide positions (including the start and end vertices) of which path p is comprised. In line 13, we assume that χ(k,x,s) can be accessed as a global array. Array indices start at 0. Comments are written in the C language-like format

**Fig. 5**
Pseudocode of the nucleotide assignment algorithm (continued from Fig. 4). i is a nucleotide position. sq is a one-dimensional array for containing a nucleotide sequence. b is a one-dimensional array containing nucleotide positions of a connected component. *lab* = 0, *lab* = 1, and *lab* = 2 indicate “nucleotide position i is v _arb”, “A and U have a priority”, and “C and G have a priority”, respectively. x and *labx* are temporary variables. In lines 8 and 14, a nucleotide type (i.e. purine or pyrimidine) can be determined based on the nucleotide type of v _arb (which is assigned in line 23 - 29) and the partition, to which position i belongs, of the bipartite graph of the connected component c. In line 22, we assume that the indicator function λ(t,u,i,j) is used to check the compatibility as a global array. In line 31, this function calls itself recursively. By resricting possible nucleotide types to purine or pyrimidine in line 7, this function becomes that for transversion operator. Array indices start at 0. Comments are written in the C language-like format

**Fig. 6**
An example of negative design operator. Here we consider a two-target design. In this example, a parent sequence, target structures 1 and 2, a selected predicted structure, and a child sequence are denoted by Parent, Target1, Target2, Predict1, Child, respectively. First, the Parent sequence is copied to the Child sequence; Step 1) we detect undesired base pairs (positions are denoted in red) in Predict1 which do not appear in the target structures; Step 2) we change the nucleotide(s) of each undesired predicted base pair to disrupt the undesired base pairing. Here, the G in Parent is changed to the C in Child (both are denoted by shadowed characters); Step 3) to repair the base complementarity denoted in red, the shadowed C is changed to the shadowed G

**Fig. 7**
An example of positive design operator. Here we consider a two-target design. In this example, a parent sequence, target structures 1 and 2, a selected predicted structure, and a child sequence are denoted by Parent, Target1, Target2, Predict1, Child, respectively. First, the Parent sequence is copied to the Child sequence; Step 1) we scan Predict1 from the left to the right to find a target base pair which is missing in Predict1 (denoted by a red base pair); Step 2) we change the shadowed A in Parent to the shadowed G in Child. Then, to repair the base complementarity, the shadowed U is changed to the shadowed C; Step 3) in addition, since the changed position forms a base pair in Target2, the shadowed A is changed to the shadowed G

**Fig. 8**
A design example of ribozyme-based RNA device. Gray circles indicate sequence constraints. The original ribozyme consensus sequence motifs and loop nucleotides of L2bulge1 [15] were used as sequence constraints. The structures and free energies shown in this figure were computed by Fold in RNAstructure package. a Ribozyme-active conformation predicted as the MFE structure. b Ribozyme-inactive conformation predicted as the MFE structure when the aptamer domain structure is constrained; the region ranging from 43 to 72 nt is the theophylline aptamer domain. The designed sequence is GCUGUCUCUCUCUGUGCUUGAGGGACUGAUGAGAGUGUACCAAUACCAGCAUCGUCUUGAUGCCCUUGGCAGU GGUAUGGUGAAUUCGAAACAGC. These structures were visualized by VARNA [40]

See this image and copyright information in PMC

Cited by

RNAblueprint: flexible multiple target nucleic acid sequence design.
Hammer S, Tschiatschek B, Flamm C, Hofacker IL, Findeiß S. Hammer S, et al. Bioinformatics. 2017 Sep 15;33(18):2850-2858. doi: 10.1093/bioinformatics/btx263. Bioinformatics. 2017. PMID: 28449031 Free PMC article.
A Web Server for Designing Molecular Switches Composed of Two Interacting RNAs.
Taneda A, Sato K. Taneda A, et al. Int J Mol Sci. 2021 Mar 8;22(5):2720. doi: 10.3390/ijms22052720. Int J Mol Sci. 2021. PMID: 33800268 Free PMC article.
Infrared: a declarative tree decomposition-powered framework for bioinformatics.
Yao HT, Marchand B, Berkemer SJ, Ponty Y, Will S. Yao HT, et al. Algorithms Mol Biol. 2024 Mar 16;19(1):13. doi: 10.1186/s13015-024-00258-2. Algorithms Mol Biol. 2024. PMID: 38493130 Free PMC article.
Design Principles of Biological Oscillators through Optimization: Forward and Reverse Analysis.
Otero-Muras I, Banga JR. Otero-Muras I, et al. PLoS One. 2016 Dec 15;11(12):e0166867. doi: 10.1371/journal.pone.0166867. eCollection 2016. PLoS One. 2016. PMID: 27977695 Free PMC article.
Automated Design of Diverse Stand-Alone Riboswitches.
Wu MJ, Andreasson JOL, Kladwang W, Greenleaf W, Das R. Wu MJ, et al. ACS Synth Biol. 2019 Aug 16;8(8):1838-1846. doi: 10.1021/acssynbio.9b00142. Epub 2019 Jul 29. ACS Synth Biol. 2019. PMID: 31298841 Free PMC article.

See all "Cited by" articles

References

1. Tang J, Breaker RR. Rational design of allosteric ribozymes. Chem Biol. 1997;4:453–9. doi: 10.1016/S1074-5521(97)90197-6. - DOI - PubMed
1. Win MN, Liang JC, Smolke CD. Frameworks for programming biological function through RNA parts and devices. Chem Biol. 2009;16:298–310. doi: 10.1016/j.chembiol.2009.02.011. - DOI - PMC - PubMed
1. Rodrigo G, Landrain TE, Jaramillo A. De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells. Proc Natl Acad Sci U S A. 2012;109:15271–6. doi: 10.1073/pnas.1203831109. - DOI - PMC - PubMed
1. Wachsmuth M, Findeiß S, Weissheimer N, Stadler PF, Mörl M. De novo design of a synthetic riboswitch that regulates transcription termination. Nucleic Acids Res. 2012;41:2541–551. doi: 10.1093/nar/gks1330. - DOI - PMC - PubMed
1. Perreault J, Weinberg Z, Roth A, Popescu O, Chartrand P, Ferbeyre G, et al. Identification of hammerhead ribozymes in all domains of life reveals novel structural variations. PLoS Comput Biol. 2011;7:1002031. doi: 10.1371/journal.pcbi.1002031. - DOI - PMC - PubMed

Publication types

Actions

MeSH terms

Actions
Actions
Actions
Actions
Actions
Actions

Substances

Actions

LinkOut - more resources

Full Text Sources
Other Literature Sources
- scite Smart Citations
Miscellaneous
- NCI CPTAC Assay Portal

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Multi-objective optimization for RNA design with multiple target secondary structures

Affiliation

Multi-objective optimization for RNA design with multiple target secondary structures

Author

Affiliation

Abstract

Figures

Similar articles

Cited by

References

Publication types

MeSH terms

Substances

LinkOut - more resources

Full Text Sources

Other Literature Sources

Miscellaneous