Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2013;8(3):e57521.
doi: 10.1371/journal.pone.0057521. Epub 2013 Mar 4.

TagGD: fast and accurate software for DNA Tag generation and demultiplexing

Affiliations

TagGD: fast and accurate software for DNA Tag generation and demultiplexing

Paul Igor Costea et al. PLoS One. 2013.

Abstract

Multiplexing is of vital importance for utilizing the full potential of next generation sequencing technologies. We here report TagGD (DNA-based Tag Generator and Demultiplexor), a fully-customisable, fast and accurate software package that can generate thousands of barcodes satisfying user-defined constraints and can guarantee full demultiplexing accuracy. The barcodes are designed to minimise their interference with the experiment. Insertion, deletion and substitution events are considered when designing and demultiplexing barcodes. 20,000 barcodes of length 18 were designed in 5 minutes and 2 million barcoded Illumina HiSeq-like reads generated with an error rate of 2% were demultiplexed with full accuracy in 5 minutes. We believe that our software meets a central demand in the current high-throughput biology and can be utilised in any field with ample sample abundance. The software is available on GitHub (https://github.com/pelinakan/UBD.git).

PubMed Disclaimer

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1
Figure 1. The generation of barcodes along with all the filtering steps are performed concurrently with the execution of the main application.
Each barcode that passes filtering is added to a pool and generation of more barcodes continues. This pool is in time drained by the main application where the insertion into the solution set takes place.
Figure 2
Figure 2. Let barcode 1 and 2 are the barcodes within the designed unique barcode set and another barcode containing errors introduced during the experiment, and the edges of the triangles represent the Levenshtein edit distance between them.
A) Barcode 1 can be converted to Barcode 3 with three operations. However two errors introduced either in barcode 1 or 2 can result in a new sequence, which requires same number of operations to transform to either barcode 1 or 2. Therefore, it cannot be classified. B) Barcode 1 is incorrectly synthesised or sequenced in such a way that it now has a smaller edit distance to barcode 2, which leads to its misclassification.

References

    1. Erlich Y, Chang K, Gordon A, Ronen R, Navon O, et al. (2009) DNA Sudoku – harnessing high-throughput sequencing for multiplexed specimen analysis. Genome research 19: 1243–1253. - PMC - PubMed
    1. Patwardhan RP, Hiatt JB, Witten DM, Kim MJ, Smith RP, et al. (2012) Massively parallel functional dissection of mammalian enhancers in vivo. Nature biotechnology 30: 265–270. - PMC - PubMed
    1. Stahl PL, Lundeberg J (2012) Toward the single-hour high-quality genome. Annual review of biochemistry 81: 359–378. - PubMed
    1. Xu Q, Schlabach MR, Hannon GJ, Elledge SJ (2009) Design of 240,000 orthogonal 25mer DNA barcode probes. Proceedings of the National Academy of Sciences of the United States of America 106: 2289–2294. - PMC - PubMed
    1. Hamady M, Walker JJ, Harris JK, Gold NJ, Knight R (2008) Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex. Nature methods 5: 235–237. - PMC - PubMed

Publication types