Bioinformatics. 2015 Mar 1;31(5):770-2.
doi: 10.1093/bioinformatics/btu719. Epub 2014 Oct 30.

deML: robust demultiplexing of Illumina sequences using a likelihood-based approach

Gabriel Renaud et al. Bioinformatics.

Abstract

Motivation: Pooling multiple samples increases the efficiency and lowers the cost of DNA sequencing. One approach to multiplexing is to use short DNA indices to uniquely identify each sample. After sequencing, reads must be assigned in silico to the sample of origin, a process referred to as demultiplexing. Demultiplexing software typically identifies the sample of origin using a fixed number of mismatches between the read index and a reference index set. This approach may fail or misassign reads when the sequencing quality of the indices is poor.
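The fixed-mismatch strategy described above can be illustrated with a minimal sketch. The function names, the sample names, and the index sequences are hypothetical and are not taken from any particular demultiplexing tool; the sketch assumes all indices have the same length.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_by_mismatches(read_index, reference_indices, max_mismatches=1):
    """Return the sample whose index is within max_mismatches of the read index.

    Returns None when no sample, or more than one sample, qualifies.
    reference_indices is a hypothetical mapping of sample name -> index sequence.
    """
    hits = [
        name
        for name, index in reference_indices.items()
        if hamming(read_index, index) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical reference index set, for illustration only.
references = {"sampleA": "ACGTACG", "sampleB": "TTGCATC"}
print(assign_by_mismatches("ACGTACC", references))  # one mismatch to sampleA
print(assign_by_mismatches("ACGAACC", references))  # two mismatches: unassigned
```

Because the threshold ignores base qualities, a read with two low-quality mismatches is discarded just as readily as one with two high-confidence mismatches, which is the weakness the abstract points to.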

Results: We introduce deML, a maximum likelihood algorithm that demultiplexes Illumina sequences. deML computes the likelihood that an observed index sequence derives from a specified sample. For each read, it generates a quality score that reflects the probability that the assignment is correct. With these quality scores, even very problematic datasets can be demultiplexed, and an error threshold can be set.
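The general idea of a likelihood-based assignment can be sketched as follows. This is a simplified illustration, not deML's published scoring. It assumes that each base call has error probability 10^(-Q/10) from its Phred quality Q, that an erroneous call is equally likely to be any of the other three bases, and that all samples are equally likely a priori. All function names, sample names, and sequences are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality to an error probability, capped at 0.75.

    At 0.75 a base call carries no information (match and mismatch are
    equally likely), and the cap also avoids log(0) for Q = 0.
    """
    return min(10 ** (-q / 10), 0.75)


def index_log_likelihood(observed, quals, reference):
    """Log-likelihood of the observed index given one reference index."""
    log_lik = 0.0
    for base, q, ref in zip(observed, quals, reference):
        e = phred_to_error(q)
        log_lik += math.log(1 - e) if base == ref else math.log(e / 3)
    return log_lik


def assign_by_likelihood(observed, quals, reference_indices):
    """Return (best sample, posterior probability) under a uniform prior."""
    log_liks = {
        name: index_log_likelihood(observed, quals, index)
        for name, index in reference_indices.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    max_ll = max(log_liks.values())
    weights = {name: math.exp(ll - max_ll) for name, ll in log_liks.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


# Hypothetical reference index set, for illustration only.
references = {"sampleA": "ACGTACG", "sampleB": "TTGCATC"}
print(assign_by_likelihood("ACGTACC", [30] * 7, references))
print(assign_by_likelihood("ACGTACC", [2] * 7, references))
```

In this sketch the same index sequence yields a confident assignment when the base qualities are high and a much less confident one when they are low, so a threshold on the posterior can replace a fixed mismatch count.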

Availability and implementation: deML is freely available for use under the GPL (http://bioinf.eva.mpg.de/deml/).

Figures

Fig. 1. Correlation between the Z1 score for reads aligned to the PhiX genome and the observed misassignment rate. Error bars were obtained using Wilson score intervals.
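For readers unfamiliar with the Wilson score interval named in the caption, the standard formula for a binomial proportion can be sketched as below. The function name and the default z value (1.96, roughly a 95% interval) are assumptions made for illustration; the confidence level used for the figure is not stated here.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of events (e.g. misassigned reads)
    trials:    total number of observations (must be > 0)
    z:         standard normal quantile; 1.96 is an assumed default
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


print(wilson_interval(5, 10))
print(wilson_interval(0, 10))
```

Unlike the simple normal approximation, this interval stays within [0, 1] and remains usable when the observed rate is close to zero, which is typical for misassignment rates.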
