Nat Methods. 2010 Feb;7(2):119-22.
doi: 10.1038/nmeth.1416. Epub 2010 Jan 17.

Parallel, tag-directed assembly of locally derived short sequence reads

Joseph B Hiatt et al. Nat Methods. 2010 Feb.

Abstract

We demonstrate subassembly, an in vitro library construction method that extends the utility of short-read sequencing platforms to applications requiring long, accurate reads. A long DNA fragment library is converted to a population of nested sublibraries, and a tag sequence directs grouping of short reads derived from the same long fragment, enabling localized assembly of long fragment sequences. Subassembly may facilitate accurate de novo genome assembly and metagenome sequencing.

Figures

Figure 1
Schematic of subassembly process. (a) Long DNA fragments are ligated to tag-adjacent adaptors, diluted and PCR-amplified. Dilution imposes a complexity bottleneck so that a limited number of long fragments are amplified. Concatemerized PCR products are then sheared by sonication and ligated to a breakpoint-adjacent adaptor. A second PCR amplification prepares amplicons for sequencing; one end of these amplicons corresponds to an end of a long fragment and the other end corresponds to a shearing breakpoint internal to that fragment. (b) Breakpoint reads are grouped in silico based on the sequence of the corresponding tag read. Breakpoint reads within a group, which derive from positions internal to the same parent long fragment, are subjected to local assembly to generate a subassembled read. (c) The metagenomic bottlenecked long-fragment library is subjected directly to paired-end Illumina sequencing to identify pairs of tag reads that were derived from opposite ends of the same original fragment. Two groups of breakpoint reads defined by distinct tag reads are merged and assembled together to generate one or more subassembled reads. In this study, this step was only applied to the metagenomic sample.
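The in silico grouping in panel (b) and the merging in panel (c) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `group_by_tag` and `merge_paired_groups` are hypothetical, tags are matched by exact string equality (real tag reads would likely need tolerance for sequencing errors), read orientation is ignored, and the local assembly step itself is left out.

```python
from collections import defaultdict


def group_by_tag(read_pairs):
    """Group breakpoint reads by the sequence of their tag read.

    read_pairs: iterable of (tag_read, breakpoint_read) strings.
    Assumption: tags are compared by exact match only.
    """
    groups = defaultdict(list)
    for tag, breakpoint_read in read_pairs:
        groups[tag].append(breakpoint_read)
    return dict(groups)


def merge_paired_groups(groups, tag_pairs):
    """Merge two tag-defined groups when their tags were observed as
    opposite ends of the same long fragment.

    tag_pairs: iterable of (tag_a, tag_b) from paired-end sequencing of
    the bottlenecked long-fragment library. Each tag is merged at most once.
    Returns a dict keyed by a tuple of the tags that define each group.
    """
    merged = {}
    used = set()
    for tag_a, tag_b in tag_pairs:
        if tag_a == tag_b or tag_a in used or tag_b in used:
            continue
        if tag_a in groups and tag_b in groups:
            merged[(tag_a, tag_b)] = groups[tag_a] + groups[tag_b]
            used.update((tag_a, tag_b))
    # Groups without an observed partner tag stay unmerged.
    for tag, reads in groups.items():
        if tag not in used:
            merged[(tag,)] = list(reads)
    return merged
```

Each resulting list of breakpoint reads would then be handed to a local assembler to produce one or more subassembled reads.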
Figure 2
Evaluation of subassembly performance. (a) Distribution of subassembled (SA) read length for P. aeruginosa sample and for methylamine metagenomic sample for unmerged and merged pairs of tag-defined read groups. (b) Cumulative per-base substitution error rate of base calls binned as a function of descending base quality in raw and SA reads, or the error rate of the x% of bases with the highest quality scores, after using BLAST to define the corresponding sequence in the reference. (c) Substitution error rate of base calls as a function of base position in raw and SA reads (binned every 3 bases). (d) Total length in sequences longer than a variable cutoff produced from SA reads compared to a standard shotgun library for the 100–1,000 bp range in which metagenomic analyses become possible. SA reads and assembled SA reads were compared to assembly of 48-bp or 76-bp paired-end reads from a standard Illumina shotgun library using Velvet with optimized parameters and an equivalent amount of raw sequence. Assembled SA reads refers to contigs produced by CABOG from SA reads.
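The quantities plotted in panels (b) and (d) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, each base call is assumed to arrive as a `(quality, is_error)` pair already compared against the reference, and the curve is computed per base rather than binned as in the figure.

```python
def cumulative_error_by_quality(calls):
    """Cumulative substitution error rate over base calls sorted by
    descending quality.

    calls: list of (quality, is_error) tuples, one per base call.
    Returns a list of (fraction_of_bases, error_rate) points, where
    error_rate is the error rate among the highest-quality fraction.
    """
    ordered = sorted(calls, key=lambda call: call[0], reverse=True)
    points = []
    errors = 0
    for count, (_, is_error) in enumerate(ordered, start=1):
        errors += int(is_error)
        points.append((count / len(ordered), errors / count))
    return points


def total_length_above(lengths, cutoff):
    """Total length contained in sequences strictly longer than cutoff."""
    return sum(length for length in lengths if length > cutoff)
```

Evaluating `total_length_above` over a range of cutoffs (for example 100 to 1,000 bp) for two sets of sequence lengths gives curves of the kind compared in panel (d).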

Comment in

  • Mind the gaps.
    Salzberg SL. Nat Methods. 2010 Feb;7(2):105-6. doi: 10.1038/nmeth0210-105. PMID: 20111034. Free PMC article.
