Genome Res. 2014 Feb;24(2):310-7.
doi: 10.1101/gr.162883.113. Epub 2013 Dec 4.

TIGRA: a targeted iterative graph routing assembler for breakpoint assembly

Ken Chen et al. Genome Res. 2014 Feb.

Abstract

Recent progress in next-generation sequencing has greatly facilitated our study of genomic structural variation. Unlike single nucleotide variants and small indels, many structural variants have not been completely characterized at nucleotide resolution. Deriving the complete sequences underlying such breakpoints is crucial not only for accurate discovery but also for the functional characterization of altered alleles. However, our current ability to determine such breakpoint sequences is limited because of challenges in aligning and assembling short reads. To address this issue, we developed a targeted iterative graph routing assembler, TIGRA, which implements a set of novel data analysis routines to achieve effective breakpoint assembly from next-generation sequencing data. In our assessment using data from the 1000 Genomes Project, TIGRA was able to accurately assemble the majority of deletion and mobile element insertion breakpoints, with a substantively better success rate and accuracy than other algorithms. TIGRA has been applied in the 1000 Genomes Project and other projects and is freely available for academic use.

Figures

Figure 1.
Schematic view of TIGRA. (A) Reads (arrow-shaped boxes) at a breakpoint (vertical dashed line in the center), including those normally mapped (gray), mate-unmapped (gray with red outline), soft-clipped (multicolored), and interchromosomally mapped (colored) are extracted from BAM files and sent to the assembly algorithm. (B) A de Bruijn graph is constructed using an iterative multiple-k-mer assembly algorithm. A contig (oval indexed node) with a specified length and average k-mer coverage (x) is connected to other contigs if it overlaps other contigs by k-1 bp (edge) in a particular orientation (arrow), and is of a particular coverage (weight). In this example, a mobile element insertion (of C2) with homology regions (C1) is successfully assembled. Two contig strings are decoded from the graph by TIGRA, representing two alternative alleles.
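
The graph construction in panel B can be illustrated with a minimal sketch. This is not TIGRA's implementation: the function names, the `min_cov` threshold, and the toy reads are assumptions made for illustration, and each k is assembled independently because the abstract does not describe how TIGRA iterates over k. Nodes are (k-1)-mers, so adjacent k-mers overlap by k-1 bp, and each edge carries its k-mer coverage as a weight.

```python
from collections import Counter, defaultdict


def de_bruijn_edges(reads, k, min_cov=1):
    """Map each (k-1)-mer to its successor (k-1)-mers, weighted by k-mer coverage."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    graph = defaultdict(dict)
    for kmer, cov in counts.items():
        if cov >= min_cov:  # hypothetical coverage filter
            graph[kmer[:-1]][kmer[1:]] = cov
    return graph


def contigs(graph):
    """Collapse non-branching paths into contig strings (isolated cycles are skipped)."""
    indeg = Counter(n for succs in graph.values() for n in succs)

    def simple(node):
        # Exactly one way in and one way out: safe to extend through.
        return indeg[node] == 1 and len(graph.get(node, {})) == 1

    out = []
    for node in list(graph):
        if simple(node):
            continue  # contigs start only at branch points or path starts
        for nxt in graph[node]:
            seq = node + nxt[-1]
            while simple(nxt):
                nxt = next(iter(graph[nxt]))
                seq += nxt[-1]
            out.append(seq)
    return out


def assemble_multi_k(reads, ks=(4, 5)):
    """Assemble the same reads at several k values (each k handled independently)."""
    return {k: contigs(de_bruijn_edges(reads, k)) for k in ks}


# Toy reads tiling the sequence ATGGCGTGCA (forward strand only).
reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]
print(contigs(de_bruijn_edges(reads, 4)))  # ['ATGGCGTGCA']
```

A real breakpoint assembler would also need to handle reverse complements, sequencing errors, and the choice among alternative paths through the graph, all of which this sketch omits.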
Figure 2.
Comparison of assembly success rate at various allele frequencies in 45 CEU samples. Six assemblers are plotted: TIGRA (purple), Velvet (blue), SGA (cyan), SGA.all (yellow), Phrap (red), and SPAdes (brown). Allele frequencies (x-axis) are derived from the deletion genotypes released by The 1000 Genomes Project Consortium, and the fraction of success (y-axis) is estimated from 245 control deletion sites.

References

    1. The 1000 Genomes Project Consortium 2010. A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073
    2. The 1000 Genomes Project Consortium 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65
    3. Abyzov A, Urban AE, Snyder M, Gerstein M 2011. CNVnator: An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res 21: 974–984
    4. Alkan C, Sajjadian S, Eichler EE 2011. Limitations of next-generation genome sequence assembly. Nat Methods 8: 61–65
    5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, et al. 2012. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19: 455–477
