Reading and writing digital data in DNA
- PMID: 31784718
- DOI: 10.1038/s41596-019-0244-5
Abstract
Because of its longevity and enormous information density, DNA is considered a promising data storage medium. In this work, we provide instructions for archiving digital information in the form of DNA and for subsequently retrieving it from the DNA. In principle, information can be represented in DNA by simply mapping the digital information to DNA sequences and synthesizing them. However, imperfections in synthesis, sequencing, storage and handling of the DNA induce errors within the molecules, making error-free information storage challenging. The procedure discussed here enables error-free storage by protecting the information using error-correcting codes. Specifically, in this protocol, we provide the technical details and precise instructions for translating digital information to DNA sequences, physically handling the biomolecules, storing them and subsequently recovering the information by sequencing the DNA. Along with the protocol, we provide computer code that automatically encodes digital information to DNA sequences and decodes the information back from DNA to a digital file. The required software is provided in a GitHub repository. The protocol relies on commercial DNA synthesis and DNA sequencing via Illumina dye sequencing, and requires 1-2 h of preparation time, half a day for sequencing preparation and 2-4 h for data analysis. This protocol focuses on storage scales of ~100 kB to 15 MB, offering an ideal starting point for small experiments. It can be augmented to enable higher data volumes and random access to the data, and it can accommodate future sequencing and synthesis technologies by changing the parameters of the encoder/decoder to account for the corresponding error rates.
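The "simple mapping" mentioned in the abstract can be illustrated with a minimal sketch. This is not the encoder distributed with the protocol: the two-bits-per-nucleotide assignment (00→A, 01→C, 10→G, 11→T) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions chosen for illustration, and the sketch deliberately omits the error-correcting codes that the protocol relies on.

```python
# Illustrative mapping only: 2 bits per nucleotide, no error correction.
# The bit-pair-to-base assignment below is an assumption, not the protocol's.
BASES = "ACGT"  # 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of four."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = bytes_to_dna(b"A")      # 0x41 = 01 00 00 01
    print(strand)                    # CAAC
    print(dna_to_bytes(strand))      # b'A'
    # A single substitution (C -> G in the last position) silently
    # changes the decoded byte, which is why unprotected mappings fail.
    print(dna_to_bytes("CAAG"))      # b'B'
```

In this naive scheme a single substituted base decodes to a different byte without any warning, which shows why the abstract emphasizes protecting the data with error-correcting codes before synthesis.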