Nat Commun. 2018 Oct 26;9(1):4451.
doi: 10.1038/s41467-018-06926-3.

Multifunctional sequence-defined macromolecules for chemical data storage

Steven Martens et al. Nat Commun.

Abstract

Sequence-defined macromolecules consist of a defined chain length (single mass), end-groups, composition and topology and prove promising in application fields such as anti-counterfeiting, biological mimicking and data storage. Here we show the potential use of multifunctional sequence-defined macromolecules as a storage medium. As a proof-of-principle, we describe how short text fragments (human-readable data) and QR codes (machine-readable data) are encoded as a collection of oligomers and how the original data can be reconstructed. The amide-urethane containing oligomers are generated using an automated protecting-group free, two-step iterative protocol based on thiolactone chemistry. Tandem mass spectrometry techniques have been explored to provide detailed analysis of the oligomer sequences. We have developed the generic software tools Chemcoder for encoding/decoding binary data as a collection of multifunctional macromolecules and Chemreader for reconstructing oligomer sequences from mass spectra to automate the process of chemical writing and reading.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Schematic representation of the QR code conversion. A QR code including a benzene structure, encoding the URL of the Wikipedia page of August Kekulé, who was the first to understand the structure of benzene and to propose it, is translated into and written as 71 different sequences. Translation is done using the Chemcoder algorithm. The sequences are read afterwards by means of tandem MS and the Chemreader algorithm. Given these sequences, Chemcoder is able to reconstruct the original QR code. Different building blocks in the sequence are highlighted: start fragment (purple box), backbone (yellow boxes), stop fragment (green box) and the functionalities (grey and blue boxes).
Fig. 2
Determining the sequence order. Tandem mass analysis (MALDI-MS/MS) of a pentamer Z5 with five different functionalities. In blue the read-out is highlighted from right to left, in purple from left to right. The coloured arrows indicate the mass difference between two mass fragments and the functionality that is responsible for this difference.
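The read-out principle in this caption, where each mass difference between neighbouring fragments identifies one functionality, can be illustrated with a minimal Python sketch. This is not the Chemreader algorithm: the unit labels, the mass values in RESIDUE_MASS and the helper names are hypothetical, and real spectra contain noise, missing peaks and backbone contributions that this sketch ignores.

```python
# Hypothetical mass contribution of each functionality (illustrative values only).
RESIDUE_MASS = {"A": 100.0, "B": 114.0, "C": 128.0, "D": 142.0, "E": 156.0}


def make_ladder(sequence, start_mass):
    """Build an idealised fragment ladder: one peak per successive unit."""
    masses = [start_mass]
    for unit in sequence:
        masses.append(masses[-1] + RESIDUE_MASS[unit])
    return masses


def read_sequence(ladder, tol=0.5):
    """Infer the unit order from mass differences between neighbouring peaks."""
    peaks = sorted(ladder)
    sequence = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        # Keep every unit whose mass matches the difference within tolerance.
        matches = [u for u, m in RESIDUE_MASS.items() if abs(m - diff) <= tol]
        if len(matches) != 1:
            raise ValueError(f"ambiguous or unknown mass difference: {diff}")
        sequence.append(matches[0])
    return "".join(sequence)


ladder = make_ladder("CADBE", start_mass=50.0)
print(read_sequence(ladder))
```

A ladder measured from the opposite chain end would yield the same units in reverse order, which is why reading in both directions, as in the figure, gives a consistency check.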
Fig. 3
Writing a sentence with sequences. The first two words of the question '1TO 2WRITE 3OR 4NOT 5TO 6WRITE 7ON 8OLIGOS?' in their chemical form. The different functionalities (in blue), introduced via acrylates in the chemical protocol, each express a different letter or number.
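The leading digit on each word of the example question appears to act as a position index, so that the sentence can be reassembled when the oligomers are read in arbitrary order. A minimal Python sketch of that idea follows; the function names are hypothetical, and the sketch assumes that words never start with a digit.

```python
import itertools


def tag_words(sentence):
    """Prefix each word with its 1-based position, as in '1TO 2WRITE ...'."""
    return [f"{i}{word}" for i, word in enumerate(sentence.split(), start=1)]


def reconstruct(tagged_words):
    """Sort tagged words by their numeric prefix and strip the prefix."""
    def split_tag(tagged):
        digits = "".join(itertools.takewhile(str.isdigit, tagged))
        return int(digits), tagged[len(digits):]

    ordered = sorted(split_tag(t) for t in tagged_words)
    return " ".join(word for _, word in ordered)


tagged = tag_words("TO WRITE OR NOT TO WRITE ON OLIGOS?")
print(tagged)
print(reconstruct(reversed(tagged)))
```

Each tagged word would then correspond to one oligomer, with one functionality per character.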
Fig. 4
Encoding and decoding of the QR code. Encoding scheme (left): the bit string representing the QR code is first translated into a pentadecimal numeral system (base-15). The resulting sequence of ‘flags’ is then cut into smaller pieces. In a final step, the position of each fragment (purple) and the length of the bit string (blue) are added. The last fragment may be filled with a non-coding spacer (black). Decoding scheme (right): after determination of the sequence of all fragments, they are dereplicated, sorted, trimmed and glued together. Finally, the sequence of flags is converted into the bit string that reconstructs the original QR code.
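The encoding and decoding steps in this caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the Chemcoder implementation: the 15-symbol alphabet, the spacer symbol, the two-digit position prefix, the payload length and the choice to store the bit-string length in a separate first fragment are all hypothetical.

```python
ALPHABET = "0123456789ABCDE"  # 15 hypothetical 'flag' symbols (base-15 digits)
SPACER = "-"                  # hypothetical non-coding spacer


def to_base15(n):
    """Convert a non-negative integer to a base-15 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 15)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))


def from_base15(s):
    """Convert a base-15 string back to an integer."""
    n = 0
    for ch in s:
        n = n * 15 + ALPHABET.index(ch)
    return n


def encode(bits, payload_len=4):
    """Bit string -> list of fragments: 2-digit position + padded payload."""
    flags = to_base15(int(bits, 2))
    header = to_base15(len(bits))  # assumption: fragment 0 stores the bit length
    if len(header) > payload_len:
        raise ValueError("bit string too long for this payload length")
    chunks = [header] + [flags[i:i + payload_len]
                         for i in range(0, len(flags), payload_len)]
    return [to_base15(pos).rjust(2, ALPHABET[0]) + chunk.ljust(payload_len, SPACER)
            for pos, chunk in enumerate(chunks)]


def decode(fragments):
    """Dereplicate, sort by position, trim spacers, join, convert to bits."""
    ordered = sorted(set(fragments), key=lambda f: from_base15(f[:2]))
    payloads = [f[2:].rstrip(SPACER) for f in ordered]
    n_bits = from_base15(payloads[0])
    value = from_base15("".join(payloads[1:]))
    return format(value, "b").zfill(n_bits)  # restore leading zeros


bits = "0010110111010001"
fragments = encode(bits)
# Fragments may be read in any order and more than once.
print(decode(fragments[::-1] + fragments[:2]) == bits)
```

Storing the bit-string length matters because converting the bits to a number discards leading zeros; the decoder pads the result back to the recorded length.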
