Review
Nat Rev Microbiol. 2016 Feb;14(2):119-28.
doi: 10.1038/nrmicro.2015.7.

The design and analysis of transposon insertion sequencing experiments

Michael C Chao et al. Nat Rev Microbiol. 2016 Feb.

Abstract

Transposon insertion sequencing (TIS) is a powerful approach that can be extensively applied to the genome-wide definition of loci that are required for bacterial growth under diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. In this Opinion article, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to the computational analysis of TIS data.

Figures

Figure 1. Transposon insertion sequencing (TIS) workflow
A. A high-density transposon insertion library containing multiple insertions in every non-essential genomic locus is created and then grown under different conditions (e.g. selective and non-selective), and mutants that are viable under each condition are recovered. The transposon junctions in the selected and non-selected pools are attached to sequencing adaptors and amplified, then subjected to high throughput sequencing. B. The sequences are mapped to the genome and the read counts at each insertion site are subjected to statistical analysis to define genomic loci that appear significantly underrepresented in the selective growth condition (conditionally essential loci, blue box). The insertions in the non-selected library (Condition A) can also be statistically evaluated to define essential loci (orange box) that are required for growth under optimal conditions. In highly saturated libraries (orange graph), non-essential and essential genes are easily distinguishable; conversely, many non-essential genes are likely not to have been disrupted by chance in less saturated libraries (purple graph).
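The effect of library saturation described in panel B can be illustrated with a small calculation. This is a minimal sketch, not the authors' method: it assumes every candidate insertion site is hit independently with probability equal to the library saturation, and the function name `prob_gene_missed` and its parameters are hypothetical.

```python
def prob_gene_missed(n_sites: int, saturation: float) -> float:
    """Chance that a non-essential gene carries no insertion at all.

    Assumption (not from the source): each of the gene's n_sites candidate
    insertion sites is disrupted independently with probability `saturation`.
    """
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must lie between 0 and 1")
    # The gene is missed only if every one of its sites is left undisrupted.
    return (1.0 - saturation) ** n_sites


if __name__ == "__main__":
    # Compare a sparse and a dense library for a gene with 10 candidate sites.
    for saturation in (0.1, 0.5, 0.9):
        print(saturation, prob_gene_missed(10, saturation))
```

Under this simplified model, short genes in sparse libraries are frequently left without insertions by chance alone, which is why they are hard to tell apart from essential genes.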
Figure 2. Experimental considerations for TIS experiments
A. V. cholerae TIS DNA libraries were constructed using either an adaptor ligation or an arbitrary PCR protocol. Plotting sequenced reads from multiple libraries reveals a substantially lower correlation coefficient when the arbitrary PCR protocol is used.
B. Plotting the average read count per insertion in all non-essential genes of a V. cholerae TIS library (right peak from Figure 1, red graph) shows that read abundance has a non-normal distribution with a positive skew.
C. Pools of reads were sampled from a sequenced M. tuberculosis TIS library that was grown in vitro (input) or recovered from a mouse infection (mouse output), and the number of unique transposon insertions detected in each pool was then plotted. At 500,000 reads (dotted line), >90% of all unique insertions were detected for each dataset, indicative of near-saturation sequencing. Figure adapted from Ref .
D. Due to ongoing DNA replication, insertion sites (triangles) adjacent to the origin in a V. cholerae TIS dataset have higher read counts than those near the terminus. Positional correction removes this bias (right). The red arrows indicate the direction of chromosome replication. Figure adapted from Ref .
E. Annotation-dependent methods for analysing essential loci compile all reads or disrupted insertion sites for a gene and determine whether fewer reads or insertion sites are represented than expected from the genome-wide distribution.
F. Sliding window analysis is one form of annotation-independent analysis. It compares reads in subgenic windows instead of genes, enabling the definition of domain essential genes. Hidden Markov model (HMM)-based methods, which are also annotation-independent, predict essentiality or non-essentiality for individual insertion sites, which can subsequently form the basis for gene-level or subgenic predictions.
*Differences in the distribution of reads (blue or orange bars) across windows due to experimental sources of noise may alter significance cutoffs (dashed lines) and consequently the classification of genes.
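The read-sampling check in panel C can be sketched as a simple rarefaction curve. This is an illustrative sketch under stated assumptions rather than the procedure used for the figure: the input is assumed to be a list with one insertion-site coordinate per mapped read, and the function name `rarefaction_curve` and its parameters are hypothetical.

```python
import random


def rarefaction_curve(read_sites, depths, seed=0):
    """Fraction of all observed unique insertion sites recovered at each depth.

    read_sites: list with one insertion-site coordinate per mapped read
                (assumed input format, not taken from the source).
    depths:     numbers of reads to subsample without replacement.
    """
    if not read_sites:
        raise ValueError("read_sites must contain at least one read")
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    total_unique = len(set(read_sites))
    curve = []
    for depth in depths:
        depth = min(depth, len(read_sites))  # cannot sample more reads than exist
        sample = rng.sample(read_sites, depth)
        curve.append((depth, len(set(sample)) / total_unique))
    return curve


if __name__ == "__main__":
    # Toy data: 200 sites with unequal read counts, loosely mimicking skewed coverage.
    toy_reads = [site for site in range(200) for _ in range(1 + site % 7)]
    for depth, fraction in rarefaction_curve(toy_reads, [50, 200, 400, len(toy_reads)]):
        print(depth, round(fraction, 3))
```

A curve that flattens well before the full read depth suggests that additional sequencing would uncover few new insertion sites, which is the sense in which the caption describes the datasets as sequenced to near-saturation.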

References

    1. Barquist L, Boinett CJ, Cain AK. Approaches to querying bacterial genomes with transposon-insertion sequencing. RNA Biol. 2013;10
    2. van Opijnen T, Camilli A. Transposon insertion sequencing: a new tool for systems-level analysis of microorganisms. Nat Rev Microbiol. 2013
    3. van Opijnen T, Bodi KL, Camilli A. Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms. Nat Methods. 2009;6:767–72.
    4. Goodman AL, et al. Identifying genetic determinants needed to establish a human gut symbiont in its habitat. Cell Host Microbe. 2009;6:279–89.
    5. Gawronski JD, Wong SM, Giannoukos G, Ward DV, Akerley BJ. Tracking insertion mutants within libraries by deep sequencing and a genome-wide screen for Haemophilus genes required in the lung. Proc Natl Acad Sci U S A. 2009;106:16422–7.
