Comput Struct Biotechnol J. 2023 Jul 16;21:3615-3626.
doi: 10.1016/j.csbj.2023.07.011. eCollection 2023.

Reverse engineering DNA origami nanostructure designs from raw scaffold and staple sequence lists

Ben Shirt-Ediss et al. Comput Struct Biotechnol J.

Abstract

Designs for scaffolded DNA origami nanostructures are commonly published in minimal form, as the list of required DNA staple and scaffold sequences. In nearly all cases, the high-level editable design files (e.g. caDNAno) that generated the low-level sequences are not made available. This de facto 'raw sequence' exchange format allows published origami designs to be re-attempted in the laboratory by other groups, but effectively prevents designs from being significantly modified or repurposed for new applications. To make the raw sequence exchange format more accessible to further design and engineering, in this work we propose the first algorithmic solution to the inverse problem of converting staple/scaffold sequences back to a 'guide schematic' resembling the original origami schematic. The guide schematic can be used to aid the manual re-input of an origami into a CAD tool like caDNAno, hence recovering a high-level editable design file. A guide schematic can also be used to check, prior to costly laboratory experimentation, that a list of staple strand sequences is free of errors and does indeed assemble into the desired origami nanostructure. We tested our reverse algorithm on 36 diverse origami designs from the literature and found that a good quality guide schematic was recovered from raw sequences for 29 origamis (81%). Our software is made available at https://revnano.readthedocs.io.

Keywords: Constraint programming; Contact map; DNA nanotechnology; DNA origami; Reverse engineering; Spring embedder.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Reverse engineering unstructured scaffold and staple sequences back into a DNA origami schematic.
Fig. 1
Forward Design and Reverse Engineering of DNA Origami Nanostructures. (a) In the traditional forward design process, a DNA origami schematic has a scaffold sequence assigned from which the Watson-Crick complementary staple strands are derived. The geometric origami schematic implicitly embeds a topological contact map detailing how the staple and scaffold bases are pairwise hybridised. The process of going from schematic to raw sequences involves two stages of information loss, I1 and I2. (b) The inverse problem addressed in this work, i.e. going from raw sequences back to a geometric origami schematic, involves the (partial) recovery of I2 and then I1. The REVNANO constraint programming solver first reconstructs an approximate contact map of scaffold-staple base pair connectivity r; the latter is converted into an equivalent graph representation of the origami domain-level connectivity D; finally a spring-embedder algorithm converts the graph into an approximate non-crossing geometric representation in 2D or 3D using spring energy minimisation.
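
To make the forward direction in panel (a) concrete, the following is a minimal sketch of how a staple and its contact map could be derived from a scaffold by Watson-Crick complementarity. It is an illustration only, not code from the REVNANO software: the function names, the toy scaffold, the 0-based indexing and the treatment of the scaffold as a linear string are all assumptions. Sections follow the (start: length) convention used in the Fig. 2 caption, running from `start` toward the scaffold 5′ end.

```python
# Illustrative sketch only; names and data are hypothetical, not REVNANO's API.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a 5'->3' DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def derive_staple(scaffold: str, sections: list[tuple[int, int]]):
    """Derive one staple (5'->3') and its contact map from scaffold sections.

    Each section is (start, length): the section starts pairing at scaffold
    index `start` (0-based, assumed convention) and runs `length` bases
    toward the scaffold 5' end. The scaffold is assumed to be linear here.
    """
    staple_bases = []
    contacts = []  # pairs of (staple index, scaffold index)
    for start, length in sections:
        if start >= len(scaffold) or start - length + 1 < 0:
            raise ValueError(f"section ({start}: {length}) leaves the scaffold")
        for offset in range(length):
            scaffold_idx = start - offset
            contacts.append((len(staple_bases), scaffold_idx))
            staple_bases.append(COMPLEMENT[scaffold[scaffold_idx]])
    return "".join(staple_bases), contacts


scaffold = "GATTACAGGCTA"  # toy scaffold, made up for illustration
staple, contacts = derive_staple(scaffold, [(4, 5), (11, 4)])
print(staple)
print(contacts)
```

Publishing only `scaffold` and `staple` while discarding `contacts` mimics the loss of the contact map that the reverse pipeline in panel (b) has to recover.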
Fig. 2
REVNANO Constraint Programming Solver: Principles Illustrated with Rothemund Smiley Origami. (a) Stage 0: Staple routing trees (SRTs) are constructed for all staples. Tree node (1236: 8) signifies a staple section that begins at scaffold base 1236 and runs for 8 bases in total toward scaffold 5′. Staple 176 has 5 viable routes across the scaffold; Staple 25 has a single viable route and casts a larger ‘footprint’ (inset boxes, red starred bases). (b) Definite 1-route staples are identified and placed on the origami as initial hard constraints. (c) Stage 1: SRTs are pruned by the footprints of other SRTs in an iterative process. Over successive iterations more staples (green) collapse to a single defined route. The initial hard constraint staples placed at Stage 0 are shown in black. (d) Stage 2: ‘Problem’ staples remaining with > 1 route are placed iteratively: on each iteration, the staple with the clearest shortest path through the current origami mesh is forced to its shortest path route. Orange = staple forced to single route on iteration; Green = ‘ripple effect’ staples collapsing to a single defined route because of the latter action; Purple = total problem multi-route staples placed up to the current iteration. (e) Stage 3: Staple-staple overlaps (at locations shown by red dots) are fixed. (f) A few minor defects remain (see Results for discussion). For the REVNANO parameters used in this example (μmin = 6 bp, σ = 5 bp, β = 0.5), 240 staples are placed approximately correctly, 1 staple is placed incorrectly and 2 staples are omitted. Note that in all diagrams, staples are superimposed on top of a pre-existing smiley scaffold routing to clearly show how staples have been placed; however, REVNANO does not know this scaffold routing a priori.
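
The idea of a staple having several viable routes can be illustrated with a much simpler search than the SRT construction described above. The sketch below, which is not the REVNANO algorithm, only looks at the first section of a staple: it finds every scaffold site where the leading `mu_min` staple bases are fully complementary and extends each hit toward the scaffold 5′ end, returning nodes in the (start: length) notation of panel (a). Reading μmin as a minimum section length is an assumption made for this example, as are the function names, the toy sequences, the 0-based indexing and the linear scaffold.

```python
# Illustrative sketch only; this is NOT the REVNANO staple routing tree code.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a 5'->3' DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def first_section_nodes(staple: str, scaffold: str, mu_min: int):
    """List candidate (start, length) nodes for the first staple section.

    `start` is the scaffold index (0-based) paired with staple base 0, and
    the section runs `length` bases toward the scaffold 5' end.
    """
    nodes = []
    seed = reverse_complement(staple[:mu_min])  # how the seed reads on the scaffold
    pos = scaffold.find(seed)
    while pos != -1:
        start = pos + mu_min - 1
        length = mu_min
        # Extend the match one base at a time toward the scaffold 5' end.
        while (length < len(staple) and start - length >= 0
               and COMPLEMENT[staple[length]] == scaffold[start - length]):
            length += 1
        nodes.append((start, length))
        pos = scaffold.find(seed, pos + 1)
    return nodes


scaffold = "ACCATTAGGCGATTAGTC"  # toy scaffold with a repeated motif
staple = "CTAATG"                # toy staple, made up for illustration
short_seed_nodes = first_section_nodes(staple, scaffold, mu_min=4)
long_seed_nodes = first_section_nodes(staple, scaffold, mu_min=6)
print(short_seed_nodes)
print(long_seed_nodes)
```

In this toy case the shorter seed matches two scaffold sites while the longer seed matches only one, which gives a rough sense of why some staples start with several candidate routes and others with a single definite one.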
Fig. 3
REVNANO Solver Parameter Sensitivity. Heatmap shows regions of the REVNANO parameter space (μmin = 6 bp, σ, β) where reverse engineering of the Rothemund Smiley contact map is most effective in terms of total number of staples placed. REVNANO was run in deterministic staple placement mode. Smiley faces 1 and 2 (right) show how staple placement differs at the (σ = 10 bp, β = 0.05) and (σ = 0 bp, β = 0.6) parameter points, respectively. White space at the top of the heatmap represents parameter points terminating in a REVNANO error condition: E2 = Not enough staples placed to perform reliable shortest-path calculations; E4 = Unresolvable 3-staple overlap exists. See Supplementary Note 7 for the (μmin = 5 bp, σ, β) case and for REVNANO run times for each parameter combination.
Fig. 4
Example Origami Guide Schematics Reverse Engineered from Raw Scaffold and Staple Sequence Lists. (a) Original origami schematics. Raw sequences derived from these schematics were passed through the reverse engineering pipeline (Fig. 1b) using REVNANO with optimal parameter settings, to produce the following reconstructed guide schematics: (b) 3D Ball in staples view, 100 % staples placed; (c) 3D Dodecahedron in scaffold routing view, 96.6 % staples placed (purple arrows highlight regions where staples omitted); (d) 2D Lotus Mesh in staples view, 97.8 % staples placed; (e) 2D Rothemund Star in staples view, 99.6 % staples placed; (f) 2D Hexagonal Tile in staples view, 98 % staples placed; (g) 2D Pentagon in staples view, 100 % staples placed; (h) 2D Annulus Mesh 2 in staples view, 100 % staples placed; (i) 2D Fivewell Plate, 100 % staples placed and shown in sequence-ambiguous junctions view computed by the AMBIG algorithm. Immoveable junctions in green, moveable junctions in red. When using the Fivewell Plate guide schematic to re-enter the design into an origami CAD tool, symmetry can be used to guess the correct positions of the ambiguous red crossovers by looking at the positions of the unambiguous green crossovers. Guide schematics for all other origamis in the test set of Table 1 are given in Supplementary Note 8.
