Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities
- PMID: 21787409
- PMCID: PMC3218830
- DOI: 10.1186/gb-2011-12-7-r68
Abstract
Background: Enrichment of loci by DNA hybridization-capture, followed by high-throughput sequencing, is an important tool in modern genetics. Currently, the most common targets for enrichment are the protein coding exons represented by the consensus coding DNA sequence (CCDS). The CCDS, however, excludes many actual or computationally predicted coding exons present in other databases, such as RefSeq and Vega, as well as non-coding functional elements such as untranslated and regulatory regions. Consequently, both the number of variants per base pair (variant density) in regions outside of the CCDS and our ability to interrogate those regions are less well understood.
Results: We examine capture sequence data from outside of the CCDS regions and find that extremes of GC content that are present in different subregions of the genome can reduce the local capture sequence coverage to less than 50% relative to the CCDS. This effect is due to biases inherent in both the Illumina and SOLiD sequencing platforms that are exacerbated by the capture process. Interestingly, for two subregion types, microRNA and predicted exons, the capture process yields higher than expected coverage when compared to whole genome sequencing. Lastly, we examine the variation present in non-CCDS regions and find that predicted exons, as well as exonic regions specific to RefSeq and Vega, show much higher variant densities than the CCDS.
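As a rough illustration of the two quantities discussed in the Results, the sketch below computes the GC fraction of a sequence and the coverage of a subregion relative to the CCDS. The function names and the example depths are hypothetical and are not taken from the paper; the paper's actual pipeline is not described in this abstract.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of called bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no called bases")
    return (seq.count("G") + seq.count("C")) / called


def relative_coverage(region_mean_depth: float, ccds_mean_depth: float) -> float:
    """Mean depth of a subregion expressed as a fraction of the CCDS mean depth."""
    if ccds_mean_depth <= 0:
        raise ValueError("CCDS mean depth must be positive")
    return region_mean_depth / ccds_mean_depth


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the calculation.
    print(gc_fraction("GGCCGGCCAT"))      # a GC-rich toy sequence
    print(relative_coverage(12.0, 30.0))  # below 0.5 means under half of CCDS coverage
```

Ambiguous bases such as N are excluded from the denominator here, which is one possible convention; counting them would lower the reported GC fraction.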
Conclusions: We show that regions outside of the CCDS perform less efficiently in capture sequence experiments. Further, we show that the variant density in computationally predicted exons is more than 2.5-times higher than that observed in the CCDS.
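Variant density, defined in the abstract as the number of variants per base pair, can be sketched as follows. This is a minimal example under stated assumptions: regions are given as half-open (start, end) intervals on a single chromosome, overlapping intervals are merged so that no base is counted twice, and all names and numbers are hypothetical rather than drawn from the paper.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on one chromosome


def merged_length(intervals: Iterable[Interval]) -> int:
    """Total number of bases covered, counting overlapping bases once."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if end <= start:
            continue  # skip empty or inverted intervals
        if current_end is None or start > current_end:
            # Close the previous merged block and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def variant_density(n_variants: int, intervals: Iterable[Interval]) -> float:
    """Variants per base pair over the merged target intervals."""
    n_bases = merged_length(intervals)
    if n_bases == 0:
        raise ValueError("no bases in target intervals")
    return n_variants / n_bases


def fold_difference(density: float, reference_density: float) -> float:
    """Ratio of a region's variant density to a reference density (e.g. CCDS)."""
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return density / reference_density
```

With these helpers, a fold difference above 2.5 for predicted exons versus the CCDS would correspond to the kind of comparison reported in the Conclusions.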
Similar articles
- The GENCODE exome: sequencing the complete human exome. Eur J Hum Genet. 2011 Jul;19(7):827-31. doi: 10.1038/ejhg.2011.28. Epub 2011 Mar 2. PMID: 21364695.
- A comparative analysis of exome capture. Genome Biol. 2011 Sep 29;12(9):R97. doi: 10.1186/gb-2011-12-9-r97. PMID: 21958622.
- Comparison of solution-based exome capture methods for next generation sequencing. Genome Biol. 2011 Sep 28;12(9):R94. doi: 10.1186/gb-2011-12-9-r94. PMID: 21955854.
- Exome versus transcriptome sequencing in identifying coding region variants. Expert Rev Mol Diagn. 2012 Apr;12(3):241-51. doi: 10.1586/erm.12.10. PMID: 22468815. Review.
- Not all exons are protein coding: Addressing a common misconception. Cell Genom. 2023 Apr 12;3(4):100296. doi: 10.1016/j.xgen.2023.100296. eCollection 2023 Apr 12. PMID: 37082142. Review.
Cited by
- SNP discovery in European anchovy (Engraulis encrasicolus, L) by high-throughput transcriptome and genome sequencing. PLoS One. 2013 Aug 1;8(8):e70051. doi: 10.1371/journal.pone.0070051. Print 2013. PMID: 23936375.
- High-throughput sequencing in mitochondrial DNA research. Mitochondrion. 2014 Jul;17:157-63. doi: 10.1016/j.mito.2014.05.004. Epub 2014 May 20. PMID: 24859348. Review.
- Genetic errors of immunity distinguish pediatric nonmalignant lymphoproliferative disorders. J Allergy Clin Immunol. 2022 Feb;149(2):758-766. doi: 10.1016/j.jaci.2021.07.015. Epub 2021 Jul 28. PMID: 34329649.
- An open access pilot freely sharing cancer genomic data from participants in Texas. Sci Data. 2016 Feb 16;3:160010. doi: 10.1038/sdata.2016.10. PMID: 26882539.
- A Novel Mutation in Insulin-Like Growth Factor 1 Receptor (c.641-2A>G) Is Associated with Impaired Growth, Hypoglycemia, and Modified Immune Phenotypes. Horm Res Paediatr. 2020;93(5):322-334. doi: 10.1159/000510764. Epub 2020 Oct 28. PMID: 33113547.