J Nucleic Acids. 2013:2013:801505.
doi: 10.1155/2013/801505. Epub 2013 Dec 17.

Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

John P Jakupciak et al. J Nucleic Acids. 2013.

Abstract

Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders, respectively. Accurate characterization of metagenomic samples depends on accurate measurement of genetic variation between isolates, with resolution down to the strain level. The sensitivity of a single biomarker is often augmented by using multiple biomarkers or biomarker panels. In parallel with single-biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome, which would then serve as the biomarker for genome identification. However, genome variation and population diversity complicate the use of direct sequencing, as do differences introduced by sample preparation protocols, including sequencing artifacts and errors. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness, and we sought to determine the strengths and limitations of WGS analysis in a forensics context. Herein, we demonstrate the use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

Figures

Figure 1
A schematic diagram of the experimental design. Theoretical accumulation of mutational variations among the 12 bacterial culture lineages.
Figure 2
Accumulated genomic diversity expected from different passaging approaches. (a) Imposing a single-cell genetic bottleneck at each passage step causes a gradual mutational shift, with all descendant cells being closely related to one another. (b) When a random subset of microbes is passaged at each step, the mutational diversity accumulated within the lineage population is expected to be much greater.
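The contrast between the two passaging approaches can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' model: the function names, population size, mutation probability, and the infinite-sites style of mutation (every new mutation gets a unique ID) are all hypothetical choices made for illustration.

```python
import random
from collections import Counter
from itertools import count


def grow(founders, offspring_per_founder, mu, rng, new_id):
    """Expand founders into a population; each daughter cell gains one
    new, unique mutation with probability mu (hypothetical toy model)."""
    population = []
    for genotype in founders:
        for _ in range(offspring_per_founder):
            if rng.random() < mu:
                population.append(genotype | {next(new_id)})
            else:
                population.append(genotype)
    return population


def run_lineage(n_passages, n_founders, pop_size, mu, seed):
    """Passage a lineage n_passages times, carrying n_founders randomly
    chosen cells forward at each step. n_founders=1 is the single-cell
    bottleneck; a larger value is the random-subset approach."""
    rng = random.Random(seed)
    new_id = count()
    founders = [frozenset()]  # start from one mutation-free ancestor
    population = []
    for _ in range(n_passages):
        population = grow(founders, pop_size // len(founders), mu, rng, new_id)
        founders = rng.sample(population, n_founders)
    return population


def segregating(population):
    """Return {mutation: cell count} for mutations carried by some,
    but not all, cells of the population."""
    counts = Counter(m for genotype in population for m in genotype)
    return {m: c for m, c in counts.items() if c < len(population)}


# Assumed parameters: 7 passages, 200 cells per culture, mu = 0.3.
bottleneck = run_lineage(7, n_founders=1, pop_size=200, mu=0.3, seed=1)
subset = run_lineage(7, n_founders=20, pop_size=200, mu=0.3, seed=1)

print("segregating mutations, single-cell bottleneck:", len(segregating(bottleneck)))
print("segregating mutations, random subset:", len(segregating(subset)))
```

In this toy model, every variable mutation under the single-cell bottleneck is private to one cell, because all cells descend from the same founder of the last passage. Under random-subset passaging, differences between founders persist across steps, so variable mutations can be shared by several cells.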
Figure 3
Pathogen culturing protocol in selective media. Following the seventh passage step, six colonies from each of the twelve cultures were selected and amplified in liquid media, and DNA was isolated from each, for a total of 72 DNA isolations per strain tested. A frozen archive sample of each clone selected for DNA isolation will also be maintained for potential future analysis.
Figure 4
Five clones of the same lineage after passage 8 were sequenced and compared for SNPs. Posterior probabilities were calculated by the program SOAPsnp. This includes SNPs detected against the progenitor culture that was sequenced right after the first passage. (a) Illustrates chromosome 1. (b) Illustrates chromosome 2.
Figure 5
Data Analysis Pipeline. SOAPsnp was used to find SNPs in the data. The threshold for SNP validation in SOAPsnp is rather low. Variant validation is critical in metagenomic samples, more so than in homogeneous samples. False variants created by sequencer error can quickly change the results of forensic analysis in metagenomic samples, where lower coverage depth and partial base consensus are expected, whereas full base consensus can be demanded in homogeneous sample data sets.
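The difference between demanding base consensus and accepting partial consensus can be sketched as a post-call filter on candidate SNPs. This is an illustrative sketch only: it does not use or reproduce SOAPsnp, and the `CandidateSNP` record, the `validate` function, and all threshold values are hypothetical assumptions rather than the pipeline described in the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSNP:
    position: int   # reference coordinate of the candidate site
    depth: int      # number of reads covering the site
    alt_reads: int  # number of reads supporting the variant base


def passes(snp, min_depth, min_alt_fraction):
    """Keep a candidate only if coverage is sufficient and enough of
    the covering reads agree on the variant base."""
    if snp.depth == 0 or snp.depth < min_depth:
        return False
    return snp.alt_reads / snp.depth >= min_alt_fraction


def validate(snps, min_depth, min_alt_fraction):
    """Return the candidates that pass both thresholds."""
    return [s for s in snps if passes(s, min_depth, min_alt_fraction)]


# Made-up candidates for illustration.
candidates = [
    CandidateSNP(position=100, depth=40, alt_reads=39),
    CandidateSNP(position=200, depth=10, alt_reads=3),
    CandidateSNP(position=300, depth=4, alt_reads=4),
]

# Homogeneous sample: near-complete base consensus is demanded (assumed values).
homogeneous = validate(candidates, min_depth=20, min_alt_fraction=0.9)

# Metagenomic sample: lower depth and partial consensus are tolerated (assumed values).
metagenomic = validate(candidates, min_depth=8, min_alt_fraction=0.2)

print([s.position for s in homogeneous])
print([s.position for s in metagenomic])
```

Relaxing the thresholds admits more candidates, which is why sequencer errors are harder to exclude in the metagenomic setting and why separate validation of variants matters more there.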
