Nat Protoc. 2009;4(6):960-74.
doi: 10.1038/nprot.2009.68. Epub 2009 May 28.

Hybrid selection of discrete genomic intervals on custom-designed microarrays for massively parallel sequencing

Emily Hodges et al. Nat Protoc. 2009.

Abstract

Complementary techniques that deepen information content and minimize reagent costs are required to realize the full potential of massively parallel sequencing. Here, we describe a resequencing approach that directs focus to genomic regions of high interest by combining hybridization-based purification of multi-megabase regions with sequencing on the Illumina Genome Analyzer (GA). The capture matrix is created by a microarray on which probes can be programmed as desired to target any non-repeat portion of the genome, while the method requires only a basic familiarity with microarray hybridization. We present a detailed protocol suitable for 1-2 µg of input genomic DNA and highlight key design tips in which high specificity (>65% of reads stem from enriched exons) and high sensitivity (98% targeted base pair coverage) can be achieved. We have successfully applied this to the enrichment of coding regions, in both human and mouse, ranging from 0.5 to 4 Mb in length. From genomic DNA library production to base-called sequences, this procedure takes approximately 9-10 d inclusive of array captures and one Illumina flow cell run.
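The two figures of merit quoted in the abstract can be illustrated with a small sketch. It assumes that specificity means the fraction of reads overlapping any targeted interval and that sensitivity means the fraction of targeted bases covered by at least one read; the abstract does not spell out the exact computation, and the function name, interval representation and example coordinates below are hypothetical.

```python
def capture_metrics(targets, reads):
    """Return (specificity, sensitivity) for one chromosome.

    targets, reads: lists of (start, end) half-open intervals.
    Assumed definitions (not taken from the protocol itself):
      specificity = reads overlapping a target / all reads
      sensitivity = target bases covered by >= 1 read / all target bases
    """
    # Expand targets into a set of base positions (fine for small examples).
    target_bases = set()
    for t_start, t_end in targets:
        target_bases.update(range(t_start, t_end))

    on_target_reads = 0
    covered_bases = set()
    for r_start, r_end in reads:
        hit = target_bases.intersection(range(r_start, r_end))
        if hit:
            on_target_reads += 1
            covered_bases |= hit

    specificity = on_target_reads / len(reads) if reads else 0.0
    sensitivity = len(covered_bases) / len(target_bases) if target_bases else 0.0
    return specificity, sensitivity


# Hypothetical example: one 100-bp target and three 36-bp reads.
example_targets = [(100, 200)]
example_reads = [(90, 126), (150, 186), (500, 536)]
spec, sens = capture_metrics(example_targets, example_reads)
```

Expanding intervals into per-base sets keeps the sketch short but does not scale to multi-megabase targets; an interval tree or a sorted sweep would be the usual choice there.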

Figures

Figure 1
Schematic diagram of the array capture protocol. High-molecular-weight genomic DNA is fragmented by sonication. The fragments are subjected to a series of enzymatic reactions that repair frayed ends, restore phosphorylation and add a single 3′ adenine overhang. The fragments are symmetrically ligated to an adaptor comprising two partially complementary oligos (colored circles indicate distinct adaptor oligo sequences). After ligation, the DNA is gel-purified and size selected for 150–300 bp. Ligated fragments are PCR enriched with primers corresponding to the adaptor ends (colored arrows represent PCR primers). A total of 8–10 reactions are carried out to obtain 20 µg of amplified DNA for hybridization and to ensure that all hybridized fragments contain detectable ends for forming sequence clusters on the Illumina instrument. The PCR products are added to a cocktail of Cot-1 DNA, blocking oligos and hybridization buffer, and the 244k tiling arrays are hybridized for several days. After hybridization, the arrays are stringently washed and captured fragments are eluted in water at 95 °C. The eluted material is lyophilized, and recovered strands either undergo further PCR amplifications or are added directly to the Illumina Genome Analyzer flow cell for cluster formation.
Figure 2
Sources of cross-hybridization that influence specificity. Two scenarios are depicted in which unintended fragments can be purified along with targeted sequences. The first illustration shows the potential hybridization between a fragment of repetitive DNA (shown in black) and the probe-bound DNA fragment (shown in orange). The second illustration shows adaptor (shown in blue) complementation between two unrelated DNA inserts.
Figure 3
Elution strategy showing chamber assembly and syringe method. To perform the elution step, nuclease-free water is added to the gasket slide surface and the array is placed with the printed side down (a). The chamber base is assembled, tightened and placed in the rotating oven at 95 °C. After the captured strands are melted, the eluted material is recovered by turning the screw 1/4 turn (b). The chamber is hot and must be handled with care by holding it with a rubber grip (d). There are two different ends to the chamber base, a narrow end (shown by arrows) and the end marked by the array labels (c). The liquid can be seen through the chamber base from the side opposite the screw through which a large air bubble is visible (c). The chamber base is tilted so that the liquid shifts toward the label end and the air bubble is near the narrow end (d). The syringe is inserted through the space between the array and gasket slides into the air bubble at the narrow end (d). The chamber is then tilted back towards the syringe (e). The liquid eluate is carefully withdrawn from the compartment into the syringe and transferred to a 1.7-ml maximum recovery centrifuge tube.
Figure 4
qPCR analysis of two selected exons. qPCRs were carried out in triplicate for individually selected exons on both the input material (pre-hyb genomic DNA) and the enriched material (post-hyb). The differential CT values between ‘pre’ and ‘post’ array capture give a clear indication that the purification of the exons has succeeded. Non-selected exons may also be included (data not shown) in this analysis to confirm depletion of unselected targets.
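The differential CT values described in this caption can be turned into an approximate fold enrichment. The sketch below uses the standard relation fold = E^(CT_pre − CT_post), and assumes perfect doubling per cycle (E = 2) and equal template mass in the pre- and post-hybridization reactions; neither assumption is stated in the caption, and the function name and CT values are hypothetical.

```python
from statistics import mean


def fold_enrichment(ct_pre, ct_post, efficiency=2.0):
    """Estimate fold enrichment of one exon from replicate CT values.

    ct_pre:  CT values for the input (pre-hyb) library.
    ct_post: CT values for the eluted (post-hyb) material.
    efficiency: amplification factor per cycle (2.0 = perfect doubling,
                an assumption rather than a measured value).
    """
    # A lower CT after capture means the exon is more abundant.
    delta_ct = mean(ct_pre) - mean(ct_post)
    return efficiency ** delta_ct


# Hypothetical triplicates: a 10-cycle shift corresponds to about 1000-fold.
example_fold = fold_enrichment([28, 28, 28], [18, 18, 18])
```

A non-selected exon run through the same calculation should give a value at or below 1 if depletion has worked.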
Figure 5
Coverage plot. Target coverage at the base pair level versus sequencing depth is plotted for each of the experiments detailed in Table 3. For all four experiments, significant breadth of coverage is achieved at or around 5×, while similarly high sequence complexity is maintained as exemplified by the corresponding curve behaviors that follow characteristics reasonably reflective of the classic Lander and Waterman curve. In all cases, 98% base pair coverage is achieved. The 2% bases not covered by reads likely reflect regions of the genome that are inaccessible or ‘unmappable’ at 36 bp.
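For orientation, the idealized Lander and Waterman expectation mentioned in this caption can be sketched as follows. Under the model's assumption that reads land uniformly at random, per-base depth is approximately Poisson with mean equal to the average depth c, so the expected fraction of bases covered at least once is 1 − e^(−c). The function name is hypothetical, and real capture data deviate from this curve because probe efficiency varies between targets.

```python
import math


def fraction_covered(mean_depth, min_depth=1):
    """Expected fraction of bases with depth >= min_depth.

    Assumes per-base depth follows a Poisson distribution with the
    given mean (the idealized Lander-Waterman setting).
    """
    # Sum the Poisson probabilities for depths 0 .. min_depth - 1.
    p_below = sum(
        math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
        for k in range(min_depth)
    )
    return 1.0 - p_below


# Expected breadth of coverage at a few mean depths.
breadth_by_depth = {c: fraction_covered(c) for c in (1, 2, 5, 10)}
```

Under this model, a mean depth of 5× already leaves less than 1% of bases uncovered, so a residual uncovered fraction is more likely to reflect unmappable sequence than insufficient depth.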
