Genome Res. 2009 Sep;19(9):1586-92.
doi: 10.1101/gr.092981.109. Epub 2009 Aug 5.

Sensitive and accurate detection of copy number variants using read depth of coverage


Seungtai Yoon et al. Genome Res. 2009 Sep.

Abstract

Methods for the direct detection of copy number variation (CNV) genome-wide have become effective instruments for identifying genetic risk factors for disease. The application of next-generation sequencing platforms to genetic studies promises to improve sensitivity to detect CNVs as well as inversions, indels, and SNPs. New computational approaches are needed to systematically detect these variants from genome sequence data. Existing sequence-based approaches for CNV detection are primarily based on paired-end read mapping (PEM) as reported previously by Tuzun et al. and Korbel et al. Due to limitations of the PEM approach, some classes of CNVs are difficult to ascertain, including large insertions and variants located within complex genomic regions. To overcome these limitations, we developed a method for CNV detection using read depth of coverage. Event-wise testing (EWT) is a method based on significance testing. In contrast to standard segmentation algorithms that typically operate by performing likelihood evaluation for every point in the genome, EWT works on intervals of data points, rapidly searching for specific classes of events. Overall false-positive rate is controlled by testing the significance of each possible event and adjusting for multiple testing. Deletions and duplications detected in an individual genome by EWT are examined across multiple genomes to identify polymorphism between individuals. We estimated error rates using simulations based on real data, and we applied EWT to the analysis of chromosome 1 from paired-end shotgun sequence data (30x) on five individuals. Our results suggest that analysis of read depth is an effective approach for the detection of CNVs, and it captures structural variants that are refractory to established PEM-based methods.
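The interval-based testing idea described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the abstract does not give the test statistic or the multiple-testing correction, so the normal model for window depth, the median/MAD scaling, the threshold `(fpr / (n / l)) ** (1 / l)`, and the function name `ewt_like_calls` are all assumptions made for illustration. The only details taken from the text are that intervals of consecutive windows are tested and that interval sizes run from 2 to 20 windows.

```python
from statistics import NormalDist, median


def ewt_like_calls(depths, max_len=20, fpr=0.05):
    """Flag runs of consecutive windows with consistently low or high depth.

    Illustrative sketch only: the scaling and the threshold are assumptions,
    not the published EWT definition.
    """
    n = len(depths)
    center = median(depths)
    mad = median(abs(d - center) for d in depths)
    if mad == 0:
        raise ValueError("depths have zero spread; cannot compute z-scores")
    scale = 1.4826 * mad  # MAD rescaled to approximate a standard deviation

    norm = NormalDist()
    z = [(d - center) / scale for d in depths]
    lower = [norm.cdf(v) for v in z]      # small when depth is unusually low
    upper = [1.0 - p for p in lower]      # small when depth is unusually high

    calls = []
    for kind, tail in (("deletion", lower), ("duplication", upper)):
        for l in range(2, max_len + 1):
            # Assumed correction: roughly n / l disjoint intervals of size l,
            # and all l windows must be jointly extreme.
            threshold = (fpr / (n / l)) ** (1.0 / l)
            for start in range(n - l + 1):
                if max(tail[start:start + l]) < threshold:
                    calls.append((kind, start, start + l))  # half-open range
    return calls


# Synthetic example: 200 windows near depth 30 with a 10-window drop.
depths = [30 + (i % 5) - 2 for i in range(200)]
depths[100:110] = [5] * 10
calls = ewt_like_calls(depths)
```

Because longer intervals get a more permissive per-window threshold, overlapping calls can extend a few windows past the true event boundary; a real pipeline would need to merge overlapping calls and refine the breakpoints.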

Figures

Figure 1.
Pipeline for the detection of CNVs based on analysis of read depth (RD). (A) RD was determined by counting the start position of reads in nonoverlapping windows of 100 bp. (B) Events were detected using a custom CNV-calling algorithm, event-wise testing (EWT). (C) Each event was examined in multiple genomes in order to distinguish polymorphic events (CNVs) from the majority of events that were found to show a similar copy number change in all five genomes in this study (i.e., monomorphic events).
Figure 2.
Illustration of the event-wise testing (EWT) method for detecting CNVs based on depth of coverage. Panel A illustrates the read depth by 100-bp window for a 15-kb (150 windows) genomic region in sample NA12891, where a 4.9-kb (49 windows) deletion was detected (chr1:157,227,901–157,232,800). The heatmap in B illustrates test results for all 100-bp windows of this region for each of the 19 event types (i.e., size 2, 3, 4,…, up to size 20) for deletion. The y-axis is event size (l). An orange dot represents a significant test result for an l-sized event, and a blue dot represents a nonsignificant test result.
Figure 3.
Examples of CNVs detected by analysis of RD. We present four examples of polymorphic gains and losses detected by EWT in five individuals. The x-axis represents genomic coordinates (in Mbp) and the y-axis represents RD, which is median-normalized to copy number 2. In each panel, plots are for NA12878, NA12891, NA12892, NA18507, and YH from top to bottom. The coordinates of A, B, C, and D are chr1:150,792,101–150,884,101, chr1:103,930,401–104,053,201, chr1:205,319,001–205,399,701, and chr1:150,422,701–150,486,501, respectively.
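
The read-depth steps mentioned in the captions (counting read start positions in nonoverlapping 100-bp windows, then median-normalizing to copy number 2) can be sketched in Python as follows. The function names, the 1-based inclusive coordinate convention, and the plain list of read start positions as input are assumptions for illustration; the captions do not describe the authors' data structures or file formats.

```python
from statistics import median

WINDOW = 100  # window size in bp, as stated in the Figure 1 caption


def window_counts(read_starts, region_start, region_end, window=WINDOW):
    """Count read start positions per nonoverlapping window.

    Coordinates are assumed to be 1-based and inclusive. Reads starting
    outside the region, or in a trailing partial window, are ignored.
    """
    n_windows = (region_end - region_start + 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        idx = (pos - region_start) // window
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def normalize_to_copy_number(counts, baseline_copies=2):
    """Rescale counts so that the median window corresponds to baseline_copies."""
    med = median(counts)
    if med == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [baseline_copies * c / med for c in counts]


# Toy example with three windows covering positions 1-300.
counts = window_counts([1, 50, 100, 101, 250, 300, 301], 1, 300)
normalized = normalize_to_copy_number(counts)

# The deletion in the Figure 2 caption spans chr1:157,227,901-157,232,800,
# i.e. 4,900 bp, which is 49 windows of 100 bp.
deletion_windows = len(window_counts([], 157_227_901, 157_232_800))
```

Using the median as the baseline keeps the normalization stable when a minority of windows carry a deletion or duplication.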
