Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms
- PMID: 29739332
- PMCID: PMC5941783
- DOI: 10.1186/s12864-018-4703-0
Abstract
Background: We present an in-depth characterization of the mechanism of sequencer-induced sample contamination caused by index swapping, which affects Illumina sequencers that use patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeqX, HiSeq 4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.
Results: Using data collected over a two-year period, we demonstrate that index swapping is widespread in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, showing that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross-contamination by using non-redundant dual indexing to completely filter index-swapped reads, and we share the sequences of 96 non-combinatorial dual indexes that we have validated across various library preparation methods and sequencer models. Finally, using computational methods, we provide greater insight into the mechanism of index swapping.
Conclusions: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeqX, HiSeq 4000/3000, and NovaSeq. Non-redundant dual indexing allows these swapped reads to be removed (flagged or filtered) and eliminates swapping-induced sample contamination, which is critical for sensitive applications such as RNA-seq, single-cell sequencing, blood biopsy using circulating tumor DNA, and clinical sequencing.
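The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes exact index matching (no mismatch tolerance), a hypothetical sample sheet in which every i7 and every i5 index is used by exactly one sample, and a swap rate defined here as swapped reads divided by all reads whose two indexes are both known. All names (`build_lookup`, `classify`, `swap_rate`, the index sequences) are illustrative.

```python
from collections import Counter

SWAPPED = "__swapped__"   # both indexes are known, but they belong to different samples
UNKNOWN = "__unknown__"   # at least one index is not on the sample sheet


def build_lookup(sample_sheet):
    """Build lookup tables and enforce non-redundant dual indexing.

    sample_sheet maps sample name -> (i7, i5). Non-redundant means no i7
    and no i5 sequence is shared between samples (assumption of this sketch).
    """
    pairs, i7_set, i5_set = {}, set(), set()
    for sample, (i7, i5) in sample_sheet.items():
        if i7 in i7_set or i5 in i5_set:
            raise ValueError(f"index reused by {sample}: design is not non-redundant")
        pairs[(i7, i5)] = sample
        i7_set.add(i7)
        i5_set.add(i5)
    return pairs, i7_set, i5_set


def classify(i7, i5, pairs, i7_set, i5_set):
    """Return the sample name, SWAPPED, or UNKNOWN for one read's index pair."""
    if (i7, i5) in pairs:
        return pairs[(i7, i5)]
    if i7 in i7_set and i5 in i5_set:
        # Each index is valid on its own, but the pair was never pooled:
        # this signature is only detectable with unique dual indexes.
        return SWAPPED
    return UNKNOWN


def swap_rate(index_reads, sample_sheet):
    """Fraction of reads with two known indexes that form an unexpected pair."""
    pairs, i7_set, i5_set = build_lookup(sample_sheet)
    counts = Counter(classify(i7, i5, pairs, i7_set, i5_set) for i7, i5 in index_reads)
    swapped = counts[SWAPPED]
    assigned = sum(n for label, n in counts.items() if label not in (SWAPPED, UNKNOWN))
    total = assigned + swapped
    return swapped / total if total else 0.0


# Hypothetical two-sample pool with made-up index sequences.
sheet = {"sample_A": ("AAAA", "CCCC"), "sample_B": ("GGGG", "TTTT")}
reads = [("AAAA", "CCCC"), ("GGGG", "TTTT"), ("AAAA", "TTTT"), ("NNNN", "CCCC")]
rate = swap_rate(reads, sheet)
```

With combinatorial indexing, where index sequences are reused across samples, the pair `("AAAA", "TTTT")` could itself be a valid sample, so a swapped read would be silently assigned to the wrong library. Under the unique pairing assumed here, the same read is flagged and can be discarded.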
Keywords: Barcodes; Exclusion amplification; ILLUMINA sequencing; Index; Index hopping; Index swapping; Indexes; Massively parallel sequencing; Multiplexing; Next generation sequencing.
Conflict of interest statement
Ethics approval and consent to participate
Only sequencing metric values automatically calculated by the Picard analysis pipeline (% contamination, etc.) and library index read data (% demultiplexed reads, % index swapping, etc.) were examined. For
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.