Front Public Health. 2019 Aug 27:7:241.
doi: 10.3389/fpubh.2019.00241. eCollection 2019.

Evaluation of Rapid Library Preparation Protocols for Whole Genome Sequencing Based Outbreak Investigation

Helena M B Seth-Smith et al. Front Public Health. 2019.

Abstract

Whole genome sequencing (WGS) has become the new gold standard for bacterial outbreak investigation because of the high resolution it offers for typing. While sequencing is currently performed predominantly on Illumina devices, the preceding library preparation can follow various protocols. Enzymatic fragmentation library preparation protocols are fast, require minimal hands-on time, and work with small quantities of DNA. The aim of our study was to compare three library preparation protocols for molecular typing: Nextera XT (Illumina), Nextera Flex (Illumina), and QIAseq FX (Qiagen). We selected 12 ATCC strains of human Gram-positive and Gram-negative pathogens with %G+C-content ranging from 27% (Fusobacterium nucleatum) to 73% (Micrococcus luteus), each with a high-quality complete genome assembly available, to allow in-depth analysis of the quality of the resulting Illumina sequence data. Additionally, we selected isolates from previously analyzed cases of vancomycin-resistant Enterococcus faecium (VRE) (n = 7) and a local outbreak of Klebsiella aerogenes (n = 5). The number of protocol steps and the time required were compared to test suitability for routine laboratory work. Data analyses were performed with standard tools commonly used in outbreak situations: Ridom SeqSphere+ for cgMLST, CLC Genomics Workbench for SNP analysis, and open source programs. Nextera Flex and QIAseq FX were found to be less sensitive than Nextera XT to variable %G+C-content, resulting in an almost uniform distribution of read depth. Low-coverage regions are therefore reduced to a minimum, giving a more complete representation of the genome. As a result, with these two protocols more alleles were detected in the cgMLST analysis, producing a higher resolution of closely related isolates. They also give a more complete representation of accessory genes.
In particular, the high data quality and relative simplicity of the workflow of Nextera Flex stood out in this comparison. This thorough comparison within an ISO/IEC 17025 accredited environment will be of interest to those aiming to optimize their clinical microbiological genome sequencing.

Keywords: Illumina; NGS; bacteria; comparison; library; next generation sequencing; prokaryotes; whole genome sequencing.

Figures

Figure 1
Quality assessment of WGS data. (A) Reads from the three library kits, subsampled to 100-fold, were mapped against the 12 reference genomes and the read depth was measured. The colors indicate the different library preparation kits. The x-axis reflects the position along the genomes and the y-axis the read depth. (B) The insert size of the different libraries was calculated using the alignment of the paired-end reads to the reference. The boxplots represent the calculations from the different species, with the lowest %G+C-content on the left and the highest on the right. In the boxplots the lower and upper hinges correspond to the first and third quartiles. The whiskers extend to 1.5x the interquartile range. (C) The base composition of all the nucleotide sites in the reads was determined. The bases on the left side show the composition around the fragmentation site.
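The boxplot convention described in panel (B) can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `boxplot_stats`, the example insert sizes, and the choice of the "inclusive" quartile method are assumptions made for illustration, and the whiskers are assumed to end at the most extreme data point lying within 1.5x the interquartile range of each hinge (a common plotting convention that the caption does not spell out).

```python
from statistics import quantiles


def boxplot_stats(values, whisker=1.5):
    """Summarise values the way a conventional boxplot does.

    Hinges are the first and third quartiles. Whiskers end at the most
    extreme data point within `whisker` x IQR of each hinge (assumed
    convention); anything beyond that is reported as an outlier.
    """
    data = sorted(values)
    # "inclusive" treats the data as the full population of interest;
    # other quartile definitions give slightly different hinges.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Hypothetical insert sizes (bp) for one library; not data from the study.
    insert_sizes = [250, 260, 270, 280, 290, 300, 310, 320, 330, 900]
    for key, value in boxplot_stats(insert_sizes).items():
        print(f"{key}: {value}")
```

Applied per species and per kit, a summary like this gives the numbers behind each box in panel (B).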
Figure 2
Comparison of the sequencing content using k-mers. (A) All k-mers identified within the reads were compared to the k-mers from the reference genomes. The x-axis shows the different read subsamples and the y-axis shows the percentage of k-mers that were found in the reads. (B) The assemblies of the sequenced strains were compared against the reference assemblies using the Jaccard index of the k-mers. The x-axis shows the read subsample used for each assembly. The y-axis shows the Jaccard index. The colors indicate the different library preparation kits. In the boxplots the lower and upper hinges correspond to the first and third quartiles. The whiskers extend to 1.5x the interquartile range.
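The two k-mer measures named in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool used in the study: the helper names (`canonical_kmers`, `fraction_recovered`, `jaccard_index`), the toy sequences, and the value of k are assumptions, and k-mers are collapsed to their canonical form (the lexicographically smaller of a k-mer and its reverse complement), which is a common choice for double-stranded DNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
VALID_BASES = set("ACGT")


def canonical_kmers(sequence, k):
    """Return the set of canonical k-mers in a DNA sequence.

    K-mers containing ambiguous bases (for example N) are skipped.
    """
    sequence = sequence.upper()
    kmers = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if not set(kmer) <= VALID_BASES:
            continue
        reverse_complement = kmer.translate(COMPLEMENT)[::-1]
        kmers.add(min(kmer, reverse_complement))
    return kmers


def fraction_recovered(reference_kmers, read_kmers):
    """Fraction of reference k-mers that also occur in the reads (panel A)."""
    if not reference_kmers:
        return 0.0
    return len(reference_kmers & read_kmers) / len(reference_kmers)


def jaccard_index(kmers_a, kmers_b):
    """Jaccard index |A n B| / |A u B| of two k-mer sets (panel B)."""
    union = kmers_a | kmers_b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(kmers_a & kmers_b) / len(union)


if __name__ == "__main__":
    # Toy sequences and k chosen for readability; real analyses use
    # whole genomes and a much larger k.
    reference = canonical_kmers("ACGTACGT", 3)
    assembly = canonical_kmers("ACGTTT", 3)
    print(fraction_recovered(reference, assembly))
    print(jaccard_index(reference, assembly))
```

The first measure asks how much of the reference is represented in the data, while the Jaccard index also penalises k-mers present in the assembly but absent from the reference.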
Figure 3
cgMLST alleles identified from the patient isolates. The different subsamples (x-axis) were used to determine the alleles of the core genome. The different strains are depicted as bars. The y-axis shows the percentage of core genes that can be used for allelic typing. The colors indicate the different library preparation kits. The E. faecium isolates were analyzed using Mentalist (A) and Ridom SeqSphere+ (B). The K. aerogenes isolates were analyzed only using Ridom SeqSphere+ (C). The failed Qia library is labeled with “*”.
Figure 4
Analysis of the K. aerogenes outbreak isolates. (A) The isolates (200-fold subsamples) were analyzed using cgMLST in Ridom SeqSphere+ and are depicted in a minimum spanning tree (MST). The isolates are shown as circles. If two strains are identical they collapse into one circle. The numbers on the lines connecting the circles show the number of different alleles between two isolates (not to scale). (B) The genomic distances between the isolates (200-fold subsamples) are shown as a phylogenetic tree representing all SNP differences across the whole genome. (C) SNP numbers across the tree called using the different subsamples.
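The allele-difference numbers on the MST edges in panel (A) can be illustrated with a minimal Python sketch. It does not reproduce Ridom SeqSphere+: the isolate names, locus names, and allele numbers are hypothetical, loci without an allele call in either isolate are assumed to be ignored in the pairwise comparison, and the tree is built with a plain Prim's algorithm.

```python
def allelic_distance(profile_a, profile_b):
    """Count loci where both isolates have a called allele and the calls differ.

    A profile maps locus name -> allele number, with None for a locus
    that could not be called (assumed to be ignored pairwise).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def minimum_spanning_tree(profiles):
    """Build an MST over isolates with Prim's algorithm.

    Returns a list of (isolate_in_tree, isolate_added, allele_differences).
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge leaving the current tree; ties break by name.
        distance, source, target = min(
            (allelic_distance(profiles[a], profiles[b]), a, b)
            for a in sorted(in_tree)
            for b in names
            if b not in in_tree
        )
        edges.append((source, target, distance))
        in_tree.add(target)
    return edges


if __name__ == "__main__":
    # Hypothetical cgMLST profiles; not data from the study.
    profiles = {
        "iso1": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": 1},
        "iso2": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": None},
        "iso3": {"locus1": 1, "locus2": 2, "locus3": 3, "locus4": 1},
    }
    for source, target, distance in minimum_spanning_tree(profiles):
        print(f"{source} -- {distance} -- {target}")
```

Under this pairwise-ignore assumption, an isolate with fewer called alleles can appear closer to its neighbours than it really is, which is one reason the number of typeable core genes matters for resolution.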
