Methods Mol Biol. 2014;1151:165-88.
doi: 10.1007/978-1-4939-0554-6_12.

Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq

Daniel E Deatherage et al. Methods Mol Biol. 2014.

Abstract

Next-generation DNA sequencing (NGS) can be used to reconstruct eco-evolutionary population dynamics and to identify the genetic basis of adaptation in laboratory evolution experiments. Here, we describe how to run the open-source breseq computational pipeline to identify and annotate genetic differences found in whole-genome and whole-population NGS data from haploid microbes where a high-quality reference genome is available. These methods can also be used to analyze mutants isolated in genetic screens and to detect unintended mutations that may occur during strain construction and genome editing.
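
As a rough illustration of the two kinds of input the abstract mentions (whole-genome clone samples and whole-population samples), the sketch below assembles a breseq command line for each case. It only builds the argument list and does not run breseq. The helper name `build_breseq_command` and all file names are hypothetical; the flags used (`-r` for the reference, `-o` for the output directory, `-j` for the number of processors, `-p` for polymorphism mode) are the commonly documented breseq options, but they should be checked against the help output of the installed version.

```python
import shlex
from typing import List, Sequence


def build_breseq_command(
    reference: str,
    reads: Sequence[str],
    output_dir: str = "breseq_output",
    polymorphism: bool = False,
    threads: int = 1,
) -> List[str]:
    """Assemble a breseq argument list (hypothetical helper; nothing is executed)."""
    if not reads:
        raise ValueError("at least one read file is required")
    cmd = ["breseq", "-r", reference, "-o", output_dir, "-j", str(threads)]
    if polymorphism:
        # Assumption: -p selects polymorphism (mixed population) mode
        # instead of the default consensus (clonal) mode.
        cmd.append("-p")
    cmd.extend(reads)  # read files are passed as positional arguments
    return cmd


if __name__ == "__main__":
    # Placeholder file names, not taken from the paper.
    clone = build_breseq_command("reference.gbk", ["clone_reads.fastq"])
    population = build_breseq_command(
        "reference.gbk",
        ["population_reads.fastq"],
        output_dir="population_output",
        polymorphism=True,
    )
    print(shlex.join(clone))
    print(shlex.join(population))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where breseq is installed.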

Figures

Fig. 1
Basic breseq command line help.
Fig. 2
Example of breseq output. The upper panel shows a portion of the summary.html file, which displays general information about the read data sets, reference sequence, and run parameters. The lower panel shows part of the main index.html page, which reports predicted mutations.
Fig. 3
Evaluating evidence supporting predicted mutations. Characteristics of high-quality (left column) and low-quality (right column) evidence items that you may encounter in breseq output are shown as discussed in the text.
Fig. 4
Possible causes of spurious or low-quality read alignment (RA) evidence. As described in the text, mismapping of reads to an incorrect reference genome site or local misalignment of bases in correctly mapped reads containing base errors can degrade accuracy and sensitivity when predicting micro-indels and single nucleotide variants.
Fig. 5
Evidence supporting complex mutations. In each case, schematics of the reference genome and the genome of a sequenced clone are shown. Evidence items that would support the genetic difference between the two genomes are shown above. Relevant graphs of read-depth coverage or read alignments are shown below. See the discussion in the text for more details.
Fig. 6
Example time course of mutation frequencies in an evolving population. A portion of the comparison file generated from the results of analyzing several whole-population samples is shown. Each column (e.g., 2K) corresponds to a sample from a different time point (e.g., 2000 generations).
