Review. Trends Genet. 2013 Oct;29(10):593-9.
doi: 10.1016/j.tig.2013.07.006. Epub 2013 Aug 22.

Finding the lost treasures in exome sequencing data

David C Samuels et al. Trends Genet. 2013 Oct.

Abstract

Exome sequencing is one of the most cost-efficient sequencing approaches for conducting genome research on coding regions. However, a significant portion of the reads obtained in exome sequencing comes from outside the designed target regions. These additional reads are generally ignored, potentially wasting an important source of genomic data. There are three major types of unintentionally sequenced reads that can be found in exome sequencing data: reads in introns and intergenic regions, reads in the mitochondrial genome, and reads originating in viral genomes. All of these can be used for reliable data mining, extending the utility of exome sequencing. Large-scale exome sequencing data repositories, such as The Cancer Genome Atlas (TCGA), the 1000 Genomes Project, the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project, and the Sequence Read Archive, provide researchers with excellent secondary data-mining opportunities to study genomic data beyond the intended target regions.

Keywords: exome capture; mitochondria; mtDNA copy number; unmapped read; virus; virus integration.
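
As an illustration of the three read categories named in the abstract, the following minimal Python sketch sorts aligned reads into on-target, off-target nuclear, mitochondrial, and unmapped bins. It is not the authors' pipeline or any of the tools they list: the function names, the category labels, the half-open coordinate convention, and the mitochondrial contig names ("chrM", "MT") are all assumptions made for the example. Unmapped reads are treated here only as candidates for later screening against viral genomes.

```python
from bisect import bisect_right
from collections import Counter


def build_index(targets):
    """Group capture targets by chromosome, sort them, and merge overlaps.

    `targets` is an iterable of (chrom, start, end) with half-open
    coordinates [start, end). This layout is an assumption of the sketch.
    """
    by_chrom = {}
    for chrom, start, end in targets:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching: extend the previous interval.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def classify_read(read, index, mito_names=("chrM", "MT")):
    """Return a hypothetical category label for one aligned read.

    `read` is (chrom, start, end); chrom is None for an unmapped read.
    """
    chrom, start, end = read
    if chrom is None:
        return "unmapped"  # candidate for viral screening
    if chrom in mito_names:
        return "mitochondrial"
    starts, ends = index.get(chrom, ([], []))
    # Last merged target that begins before the read ends.
    i = bisect_right(starts, end - 1) - 1
    if i >= 0 and ends[i] > start:
        return "on_target"
    return "off_target_nuclear"  # intronic or intergenic


def summarize(reads, targets):
    """Count reads per category."""
    index = build_index(targets)
    return Counter(classify_read(read, index) for read in reads)
```

In practice the read tuples would come from an alignment file and the targets from the capture kit's interval file; both are left as plain tuples here so that the sketch stays self-contained.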

Figures

Figure 1. Results of a PubMed search for papers using the term ‘exome’, through 1 July 2013, showing the rapid and recent spread of this sequencing method.
Figure 2. A flow diagram illustrating how off-target reads can be identified from exome sequencing data. Currently available tools for the analysis of the different types of off-target reads are given. Abbreviation: SNP, single nucleotide polymorphism.
