The design and analysis of transposon insertion sequencing experiments

Michael C Chao¹, Sören Abel², Brigid M Davis¹, Matthew K Waldor¹

Affiliations

¹ Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02115, USA; the Division of Infectious Disease, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; and the Howard Hughes Medical Institute, Boston, Massachusetts 02115, USA.
² Department of Pharmacy, University of Tromsø, The Arctic University of Norway, 9019 Tromsø, Norway.

PMID: 26775926
PMCID: PMC5099075
DOI: 10.1038/nrmicro.2015.7

Review

Michael C Chao et al. Nat Rev Microbiol. 2016 Feb;14(2):119-28. doi: 10.1038/nrmicro.2015.7.

Abstract

Transposon insertion sequencing (TIS) is a powerful approach that can be extensively applied to the genome-wide definition of loci that are required for bacterial growth under diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. In this Opinion article, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to the computational analysis of TIS data.

Figures

**Figure 1. Transposon insertion sequencing (TIS) workflow**
A. A high-density transposon insertion library containing multiple insertions in every non-essential genomic locus is created and then grown under different conditions (e.g. selective and non-selective), and mutants that are viable under each condition are recovered. The transposon junctions in the selected and non-selected pools are attached to sequencing adaptors and amplified, then subjected to high-throughput sequencing. B. The sequences are mapped to the genome and the read counts at each insertion site are subjected to statistical analysis to define genomic loci that appear significantly underrepresented in the selective growth condition (conditionally essential loci, blue box). The insertions in the non-selected library (Condition A) can also be statistically evaluated to define essential loci (orange box) that are required for growth under optimal conditions. In highly saturated libraries (orange graph), non-essential and essential genes are easily distinguishable; conversely, in less saturated libraries (purple graph), many non-essential genes are likely to lack insertions purely by chance.
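The comparison described in panel B can be illustrated with a minimal Python sketch. It is not one of the statistical frameworks discussed in the article: it only rescales the selected pool to the size of the non-selected pool and reports a per-gene log2 fold change. The function name, the pseudocount, and the gene names and counts are all hypothetical, and a real analysis would add a significance test on top of this step.

```python
import math


def log2_fold_changes(control, selected, pseudocount=1.0):
    """Per-gene log2(selected / control) after library-size scaling.

    control, selected: dicts mapping gene name -> summed read count.
    pseudocount: assumed small constant that avoids division by zero.
    """
    total_control = sum(control.values())
    total_selected = sum(selected.values())
    if total_control == 0 or total_selected == 0:
        raise ValueError("both pools must contain at least one read")

    # Rescale the selected pool so both pools have the same total read count.
    scale = total_control / total_selected

    result = {}
    for gene, control_reads in control.items():
        selected_reads = selected.get(gene, 0) * scale
        ratio = (selected_reads + pseudocount) / (control_reads + pseudocount)
        result[gene] = math.log2(ratio)
    return result


# Hypothetical counts: geneB is depleted under the selective condition.
control_counts = {"geneA": 500, "geneB": 500}
selected_counts = {"geneA": 990, "geneB": 10}
changes = log2_fold_changes(control_counts, selected_counts)
for gene, value in sorted(changes.items()):
    print(gene, round(value, 2))
```

A strongly negative value flags a candidate conditionally essential locus, but the fold change alone does not account for the sampling noise that the article discusses.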

**Figure 2. Experimental considerations for TIS experiments**
A. *V. cholerae* TIS DNA libraries were constructed using either an adaptor ligation or an arbitrary PCR protocol. Plotting of sequenced reads from multiple libraries reveals a substantially lower correlation coefficient when an arbitrary PCR protocol is used. B. Plotting of the average read count per insertion in all non-essential genes of a *V. cholerae* TIS library (right peak from Figure 1, red graph) shows that read abundance has a non-normal distribution with a positive skew. C. Pools of reads were sampled from a sequenced *M. tuberculosis* TIS library that was grown *in vitro* (input) or recovered from a mouse infection (mouse output), and the number of unique transposon insertions detected in each pool was then plotted. At 500,000 reads (dotted line), >90% of all unique insertions were detected for each dataset, indicative of near-saturation sequencing. Figure adapted from Ref . D. Due to ongoing DNA replication, insertion sites (triangles) adjacent to the origin from a *V. cholerae* TIS dataset have higher read counts than those near the terminus. Positional correction removes this bias (right). Figure adapted from Ref . The red arrows indicate the direction of chromosome replication. E. Annotation-dependent essential loci analysis methods compile all reads or insertion sites disrupted for a gene and determine whether fewer reads or insertion sites are represented compared to the genome-wide distribution. F. Sliding window analysis is one form of annotation-independent analysis. It compares reads in subgenic windows instead of genes, enabling the definition of domain-essential genes. Hidden Markov model (HMM)-based methods, which are also annotation-independent, predict essentiality/non-essentiality for individual insertion sites, which can subsequently form the basis for gene-level or subgenic predictions.
*Differences in the distribution of reads (blue or orange bars) across windows due to experimental sources of noise may alter significance cutoffs (dashed lines) and consequently the classification of genes.
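
The read-sampling check described in panel C can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used for the figure: the function names, the fixed random seed, and the example counts are hypothetical, and reads are drawn without replacement from per-site read counts.

```python
import random


def unique_sites_at_depth(site_counts, depth, seed=0):
    """Count distinct insertion sites seen in a random subsample of reads.

    site_counts: list of read counts, one entry per insertion site.
    depth: number of reads to draw without replacement.
    """
    # Expand the counts into one entry per read, labelled by site index.
    reads = [site for site, n in enumerate(site_counts) for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of sequenced reads")
    sampled = random.Random(seed).sample(reads, depth)
    return len(set(sampled))


def rarefaction_curve(site_counts, depths, seed=0):
    """Return (depth, unique sites detected) pairs for each requested depth."""
    return [(d, unique_sites_at_depth(site_counts, d, seed)) for d in depths]


# Hypothetical library: six candidate sites, two of which received no reads.
example_counts = [5, 0, 3, 1, 0, 10]
for depth, n_sites in rarefaction_curve(example_counts, [1, 5, 10, 19]):
    print(depth, n_sites)
```

When the curve flattens well before the full read depth, additional sequencing is unlikely to reveal many new insertions, which is the sense in which the caption describes near-saturation sequencing. Expanding counts into a list of reads is simple but memory-hungry, so a real dataset would call for a more economical sampling scheme.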

References

1. Barquist L, Boinett CJ, Cain AK. Approaches to querying bacterial genomes with transposon-insertion sequencing. RNA Biol. 2013;10.
2. van Opijnen T, Camilli A. Transposon insertion sequencing: a new tool for systems-level analysis of microorganisms. Nat Rev Microbiol. 2013.
3. van Opijnen T, Bodi KL, Camilli A. Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms. Nat Methods. 2009;6:767–72.
4. Goodman AL, et al. Identifying genetic determinants needed to establish a human gut symbiont in its habitat. Cell Host Microbe. 2009;6:279–89.
5. Gawronski JD, Wong SM, Giannoukos G, Ward DV, Akerley BJ. Tracking insertion mutants within libraries by deep sequencing and a genome-wide screen for Haemophilus genes required in the lung. Proc Natl Acad Sci U S A. 2009;106:16422–7.
