Nucleic Acids Res. 2004 Feb 27;32(4):1392-403.
doi: 10.1093/nar/gkh291. Print 2004.

Paradigms for computational nucleic acid design

Robert M Dirks et al. Nucleic Acids Res.

Abstract

The design of DNA and RNA sequences is critical for many endeavors, from DNA nanotechnology, to PCR-based applications, to DNA hybridization arrays. Results in the literature rely on a wide variety of design criteria adapted to the particular requirements of each application. Using an extensively studied thermodynamic model, we perform a detailed study of several criteria for designing sequences intended to adopt a target secondary structure. We conclude that superior design methods should explicitly implement both a positive design paradigm (optimize affinity for the target structure) and a negative design paradigm (optimize specificity for the target structure). The commonly used approaches of sequence symmetry minimization and minimum free-energy satisfaction primarily implement negative design and can be strengthened by introducing a positive design component. Surprisingly, our findings hold for a wide range of secondary structures and are robust to modest perturbation of the thermodynamic parameters used for evaluating sequence quality, suggesting the feasibility and ongoing utility of a unified approach to nucleic acid design as parameter sets are refined further. Finally, we observe that designing for thermodynamic stability does not determine folding kinetics, emphasizing the opportunity for extending design criteria to target kinetic features of the energy landscape.
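The distinction between the two paradigms can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes a hypothetical three-structure ensemble (target, one undesired structure, and the strand with no base pairs), made-up free energies, and a Boltzmann weighting at 37 °C. A real evaluation would consider the full ensemble of secondary structures. Positive design is modeled as picking the sequence with the lowest free energy on the target; negative design as picking the sequence with the highest equilibrium probability of the target.

```python
import math

# Thermal energy in kcal/mol at 37 °C: gas constant (kcal/(mol K)) times temperature (K).
RT = 0.0019872 * 310.15

def target_probability(energies, target):
    """Boltzmann probability of `target` within the listed structures only."""
    weights = {name: math.exp(-dg / RT) for name, dg in energies.items()}
    return weights[target] / sum(weights.values())

# Hypothetical free energies (kcal/mol) for two candidate sequences.
# "open" is the strand with no base pairs, which defines ΔG = 0.
candidates = {
    "A": {"target": -8.0, "undesired": -10.0, "open": 0.0},
    "B": {"target": -6.0, "undesired": -3.0, "open": 0.0},
}

# Positive design: strongest affinity for the target (lowest ΔG on the target).
positive_choice = min(candidates, key=lambda s: candidates[s]["target"])

# Negative design: highest specificity for the target (largest target probability).
negative_choice = max(
    candidates, key=lambda s: target_probability(candidates[s], "target")
)

print(positive_choice, negative_choice)
```

With these assumed numbers, sequence A binds the target more strongly but prefers the undesired structure, so the two criteria disagree, which is the situation the abstract argues a combined approach should resolve.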

Figures

Figure 1
(a) Feedback loop for evaluating nucleic acid sequence designs and methodologies. (b) Positive and negative design paradigms. Two sequences are evaluated using an empirical potential on both the desired target structure and an undesired structure. Using a positive design paradigm, sequence A would be selected since it exhibits a stronger affinity than sequence B for the target structure (i.e. lower ΔG). Using a negative design paradigm, sequence B would be selected since it exhibits specificity for the target structure while sequence A exhibits specificity for the undesired structure. To provide a common basis for comparison, ΔG = 0 for a strand with no base pairs. (c) Canonical loops of nucleic acid secondary structure: hairpin loops, stacked base pairs, a bulge loop, an interior loop and a multiloop. These loop structures are all nested (i.e. there are no crossing arcs in the corresponding polymer graph with the backbone drawn as a straight line). (d) A sample pseudoknot with base pairs a·f and c·h (with a < c) that fail to satisfy the nesting property a < c < h < f, yielding crossing arcs in the corresponding polymer graph.
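The nesting property in panels (c) and (d) can be checked directly from a list of base pairs. The following is a minimal sketch with hypothetical helper names (`pairs_cross`, `is_nested`) and a simple pairwise comparison; it is meant to illustrate the definition, not to reproduce any code from the paper.

```python
def pairs_cross(p, q):
    """Return True if base pairs p and q form crossing arcs.

    Each pair is a tuple of two distinct nucleotide indices. After ordering,
    pairs (i, j) and (k, l) with i < k cross exactly when i < k < j < l.
    """
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    return i < k < j < l

def is_nested(pairs):
    """Return True if no two base pairs cross (no pseudoknot)."""
    pairs = list(pairs)
    return not any(
        pairs_cross(pairs[a], pairs[b])
        for a in range(len(pairs))
        for b in range(a + 1, len(pairs))
    )

# A hairpin stem: every pair is enclosed by the previous one, so it is nested.
hairpin = [(1, 10), (2, 9), (3, 8)]

# A pseudoknot in the sense of panel (d): pairs a·f and c·h with a < c < f < h.
pseudoknot = [(1, 6), (3, 8)]

print(is_nested(hairpin), is_nested(pseudoknot))
```

The pairwise test is quadratic in the number of base pairs, which is adequate for illustration; a stack-based scan would be the usual choice for long sequences.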
Figure 2
RNA multiloop. (a) Histograms for 100 sequence designs based on probability of sampling the target graph, p(s∗). The color legend applies to all plots. (b) Histograms for the same 100 sequence designs based on average number of incorrect nucleotides, n(s∗). (c) Base-pairing probabilities P_{i,j} for the median sequence based on p(s∗). Square sizes correspond to P_{i,j} ≥ {0.5, 0.05, 0.005}, respectively. The target structure is identical to that obtained by optimizing probability (black) or the average number of incorrect nucleotides (not shown). (d) p(s∗) versus free energy, ΔG(s∗). Each dot corresponds to one of 100 sequences designed using each method. Each bold square corresponds to the median over the 100 sequences designed using each method. (e) p(s∗) versus median folding time, t(s∗), over 1000 kinetic trajectories starting from random coil initial conditions. Dots and squares are interpreted as in (d).
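The caption does not define n(s∗) explicitly. One plausible reading, assumed here, is the sequence length minus the expected number of nucleotides whose pairing state matches the target, computed from the base-pairing probabilities with an extra column for the probability that a base is unpaired. The function name and the toy probability matrix below are hypothetical.

```python
def average_incorrect_nucleotides(pair_prob, target_partner):
    """Expected number of nucleotides whose pairing state differs from the target.

    pair_prob[i][j]  : probability that base i pairs with base j (0-based);
                       column n (one past the last base) holds the probability
                       that base i is unpaired.
    target_partner[i]: index of the partner of base i in the target structure,
                       or None if base i is unpaired in the target.
    """
    n = len(target_partner)
    expected_correct = 0.0
    for i, partner in enumerate(target_partner):
        j = n if partner is None else partner
        expected_correct += pair_prob[i][j]
    return n - expected_correct

# Toy 4-nucleotide example: the target pairs base 0 with base 3; bases 1 and 2 are unpaired.
target_partner = [3, None, None, 0]
pair_prob = [
    [0.0, 0.0, 0.0, 0.9, 0.1],  # base 0: paired with 3 (0.9) or unpaired (0.1)
    [0.0, 0.0, 0.0, 0.0, 1.0],  # base 1: always unpaired
    [0.0, 0.0, 0.0, 0.0, 1.0],  # base 2: always unpaired
    [0.9, 0.0, 0.0, 0.0, 0.1],  # base 3: paired with 0 (0.9) or unpaired (0.1)
]

n_incorrect = average_incorrect_nucleotides(pair_prob, target_partner)
print(round(n_incorrect, 6))
```

Under this reading, a design with p(s∗) close to 1 necessarily has n(s∗) close to 0, while a design can have small n(s∗) without concentrating probability on the single target structure.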
Figure 3
RNA model perturbation study. For the multiloop designs of Figure 2, the top-ranked sequence for each method based on p(s∗) is re-examined using 1000 randomized potential functions where every parameter is independently adjusted by an amount uniformly distributed on ±10%, ±20% or ±50%. The original probabilities are depicted as dashed lines.
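The randomization described in this caption can be sketched as follows, assuming that "adjusted by an amount uniformly distributed on ±10%" means each parameter is multiplied by an independent factor drawn uniformly from [0.9, 1.1] (and likewise for ±20% and ±50%). The parameter names and values are hypothetical placeholders, not entries from the thermodynamic model used in the paper.

```python
import random

def perturb_parameters(params, max_fraction, rng):
    """Return a copy of `params` with every value independently rescaled.

    Each value is multiplied by (1 + u), with u drawn uniformly from
    [-max_fraction, +max_fraction].
    """
    return {
        name: value * (1.0 + rng.uniform(-max_fraction, max_fraction))
        for name, value in params.items()
    }

# Hypothetical energy parameters (kcal/mol), for illustration only.
base_params = {
    "stack_GC_CG": -3.4,
    "stack_AU_UA": -1.1,
    "hairpin_loop_4": 5.6,
    "multiloop_penalty": 3.4,
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# 1000 randomized potential functions at each perturbation level.
randomized = {
    level: [perturb_parameters(base_params, level, rng) for _ in range(1000)]
    for level in (0.10, 0.20, 0.50)
}

print(len(randomized[0.10]), randomized[0.10][0])
```

Each randomized parameter set would then be used to re-evaluate p(s∗) for a fixed designed sequence, giving the distributions that the figure compares against the original probabilities.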
Figure 4
RNA multiloop variations. Design performance based on (a) p(s∗) and (b) n(s∗) with stem α = (4,6,8) and single-stranded multiloop regions β = (0,2,4). Surfaces show the mean values plus and minus one standard deviation for 100 independently designed sequences. The results for optimizing average incorrect nucleotides (not shown) are nearly indistinguishable from those obtained by optimizing probability.
Figure 5
Large RNA multiloop. See caption for Figure 2a–c.
Figure 6
RNA pseudoknot. See caption for Figure 2a–c.
