Molecular-level similarity search brings computing to DNA data storage
- PMID: 34362913
- PMCID: PMC8346626
- DOI: 10.1038/s41467-021-24991-z
Abstract
As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved using a pre-specified key, analogous to a filename. However, these approaches lack the ability to perform more complex computations over the stored data, such as similarity search: e.g., finding images that look similar to an image of interest without prior knowledge of their file names. Here we demonstrate a technique for executing similarity search over a DNA-based database of 1.6 million images. Queries are implemented as hybridization probes, and a key step in our approach was to learn an image-to-sequence encoding ensuring that queries preferentially bind to targets representing visually similar images. Experimental results show that our molecular implementation performs comparably to state-of-the-art in silico algorithms for similarity search.
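The idea of a query acting as a hybridization probe can be illustrated with a small in silico sketch. Everything in it is an assumption made for illustration: the `encode` function is a hand-written quantization, not the learned image-to-sequence encoding described above, and `binding_score` counts Watson-Crick base pairs as a crude stand-in for real hybridization behavior. The feature vectors and item names are hypothetical.

```python
from typing import Dict, List, Sequence

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BASES = "ACGT"


def encode(features: Sequence[float]) -> str:
    """Toy encoding (assumption): quantize each feature in [0, 1] to one base."""
    return "".join(BASES[min(3, max(0, int(x * 4)))] for x in features)


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with `seq` in antiparallel orientation."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def binding_score(probe: str, target: str) -> int:
    """Count base pairs formed when the probe aligns antiparallel to the target.

    This is a simple match count, not a thermodynamic model.
    """
    return sum(COMPLEMENT[p] == t for p, t in zip(reversed(probe), target))


def search(query: Sequence[float],
           database: Dict[str, Sequence[float]],
           top_k: int) -> List[str]:
    """Rank stored items by how strongly the query probe would pair with them."""
    probe = reverse_complement(encode(query))
    ranked = sorted(
        database,
        key=lambda name: binding_score(probe, encode(database[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical feature vectors standing in for image features.
database = {
    "cat1": [0.1, 0.9, 0.5, 0.3],
    "cat2": [0.1, 0.8, 0.5, 0.3],
    "car": [0.9, 0.1, 0.2, 0.7],
}
results = search([0.1, 0.9, 0.5, 0.2], database, top_k=2)
```

In this toy setup, items whose features quantize to sequences close to the query's sequence pair with the probe at more positions and rank higher. A learned encoding, as the abstract describes, would aim for the same property with respect to visual similarity rather than per-feature quantization.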
© 2021. The Author(s).
Conflict of interest statement
C.B., Y.C., G.S., K.S., and L.C. declare the following competing interests: they filed a patent application on the core idea. The patent applicant is the University of Washington, the application number is PCT/US2020/027545, the application is being evaluated, and it covers the system and method for similarity search, including the encoding method. K.S. and Y.C. declare the following competing interests: they are employed by Microsoft. The remaining authors declare no competing interests.
