BMC Genomics. 2012 May 20;13:194.
doi: 10.1186/1471-2164-13-194.

Exome sequencing generates high quality data in non-target regions

Yan Guo et al. BMC Genomics. 2012.

Abstract

Background: Exome sequencing using next-generation sequencing technologies is a cost-efficient approach to selectively sequencing the coding regions of the human genome for detection of disease variants. A significant number of DNA fragments from the capture process fall outside target regions, and sequence data for positions outside target regions have mostly been ignored after alignment.

Results: We performed whole exome sequencing on 22 subjects using the Agilent SureSelect capture reagent and on 6 subjects using the Illumina TruSeq capture reagent. We also downloaded sequencing data for 6 subjects from the 1000 Genomes Project Pilot 3 study. Using these data, we examined the quality of SNPs detected outside target regions by computing the consistency rate with genotypes obtained from SNP chips or the HapMap database, the transition-transversion (Ti/Tv) ratio, and the percentage of SNPs present in dbSNP. For all three platforms, we obtained high-quality SNPs outside target regions, some of them far from target regions. In our Agilent SureSelect data, we obtained 84,049 high-quality SNPs outside target regions compared to 65,231 SNPs inside target regions (a 129% increase). For our Illumina TruSeq data, we obtained 222,171 high-quality SNPs outside target regions compared to 95,818 SNPs inside target regions (a 232% increase). For the data from the 1000 Genomes Project, we obtained 7,139 high-quality SNPs outside target regions compared to 1,548 SNPs inside target regions (a 461% increase).
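The three quality measures named above, and the reported percentage increases, can be illustrated with a minimal Python sketch. The function names and the input layouts (pairs of reference and alternate bases, dictionaries of genotype calls keyed by site) are hypothetical choices made for illustration; the abstract does not describe the authors' actual pipeline. The sketch also assumes that each reported increase is the off-target SNP count expressed as a percentage of the in-target count, which reproduces the published figures.

```python
# Illustrative sketch only: names and data layouts are assumptions,
# not the authors' implementation.

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snps):
    """Return the transition/transversion ratio for (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in snps:
        pair = (ref.upper(), alt.upper())
        if pair[0] == pair[1]:
            continue  # not a substitution, skip
        if pair in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions: Ti/Tv is undefined")
    return ti / tv


def consistency_rate(seq_calls, chip_calls):
    """Fraction of shared sites where the sequencing genotype matches the chip.

    Genotypes are strings such as "AG"; allele order is ignored.
    """
    shared = seq_calls.keys() & chip_calls.keys()
    if not shared:
        raise ValueError("no sites in common")
    matches = sum(
        sorted(seq_calls[site]) == sorted(chip_calls[site]) for site in shared
    )
    return matches / len(shared)


def percent_increase(outside, inside):
    """Off-target SNP count as a percentage of the in-target count."""
    return 100.0 * outside / inside


# Counts taken from the abstract.
print(round(percent_increase(84_049, 65_231)))   # 129 (Agilent SureSelect)
print(round(percent_increase(222_171, 95_818)))  # 232 (Illumina TruSeq)
print(round(percent_increase(7_139, 1_548)))     # 461 (1000 Genomes Pilot 3)
```

The share of SNPs already present in dbSNP would follow the same pattern as `consistency_rate`: a count of matching sites divided by the total number of called sites.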

Conclusions: These results demonstrate that a substantial number of high-quality genotypes outside target regions can be obtained from exome sequencing data. These data should not be ignored in genetic epidemiology studies.

Figures

Figure 1. Average depth around boundaries of target regions (1-50 bp inside and 1-200 bp outside boundaries). Negative distance means inside a target region, and positive distance means outside a target region.
Figure 2. Distributions of depth, mapping quality score, and base quality score for “Inside TR”, “Outside ≤200 bp”, and “Outside >200 bp”.
Figure 3. Distribution of sites with a minimum depth of 5 to 10.
Figure 4. Average SNP count per sample, heterozygote consistency, and Ti/Tv ratio.
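
The signed-distance convention in the Figure 1 caption and the three bins in the Figure 2 caption could be computed roughly as follows. This is a sketch under stated assumptions: target regions are taken to be sorted, non-overlapping, 1-based inclusive `(start, end)` intervals on a single chromosome, a base sitting on a region boundary counts as 1 bp inside, and the names `signed_distance` and `classify` are hypothetical.

```python
from bisect import bisect_right


def signed_distance(pos, regions):
    """Distance from pos to the nearest target-region boundary.

    Negative means inside a region, positive means outside.
    Assumes regions is a sorted list of non-overlapping (start, end)
    intervals, 1-based and inclusive.
    """
    if not regions:
        raise ValueError("no target regions given")
    starts = [start for start, _ in regions]
    i = bisect_right(starts, pos) - 1  # last region starting at or before pos
    candidates = []
    if i >= 0:
        start, end = regions[i]
        if pos <= end:
            # Inside: a boundary base is 1 bp inside, hence the +1.
            return -(min(pos - start, end - pos) + 1)
        candidates.append(pos - end)  # distance past the previous region
    if i + 1 < len(regions):
        candidates.append(regions[i + 1][0] - pos)  # distance to next region
    return min(candidates)


def classify(pos, regions, near=200):
    """Assign pos to one of the three bins used in the Figure 2 caption."""
    d = signed_distance(pos, regions)
    if d < 0:
        return "Inside TR"
    if d <= near:
        return "Outside <=200 bp"
    return "Outside >200 bp"


# Two made-up target regions for illustration.
targets = [(100, 200), (1000, 1100)]
for p in (150, 201, 400, 600):
    print(p, signed_distance(p, targets), classify(p, targets))
```

Rebuilding the `starts` list on every call keeps the sketch short; for many lookups it would be computed once and reused.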
