Nat Methods. 2009 Jan;6(1):99-103.
doi: 10.1038/nmeth.1276. Epub 2008 Nov 30.

High-resolution mapping of copy-number alterations with massively parallel sequencing

Derek Y Chiang et al. Nat Methods. 2009 Jan.

Abstract

Cancer results from somatic alterations in key genes, including point mutations, copy-number alterations and structural rearrangements. A powerful way to discover cancer-causing genes is to identify genomic regions that show recurrent copy-number alterations (gains and losses) in tumor genomes. Recent advances in sequencing technologies suggest that massively parallel sequencing may provide a feasible alternative to DNA microarrays for detecting copy-number alterations. Here we present: (i) a statistical analysis of the power to detect copy-number alterations of a given size; (ii) SegSeq, an algorithm to segment equal copy numbers from massively parallel sequence data; and (iii) analysis of experimental data from three matched pairs of tumor and normal cell lines. We show that a collection of approximately 14 million aligned sequence reads from human cell lines has comparable power to detect events as the current generation of DNA microarrays and has over twofold better precision for localizing breakpoints (typically, to within approximately 1 kilobase).

Figures

Fig. 1. Theoretical coverage required to detect single copy gains and losses
(a) Schematic overview for detecting copy-number alterations by sequencing. (b,c) Power calculations to detect copy-number alterations for a single copy gain and loss. We considered fixed windows L ranging from L = 10 kb to L = 100 kb. Lines indicate approximated power based on the distribution of ratios of normally-distributed random variables. For L = 30 kb we plotted simulation results for ratios of Poisson-distributed random variables (cyan dots). The approximation is accurate to within 10% (cyan dotted lines) for windows with average number of reads λ greater than 80 (dashed black line).
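
The power calculation described in this caption can be illustrated with a minimal sketch. It is not the authors' code. It assumes equal sequencing depth in tumor and normal, a one-sided test on the log of the tumor/normal read ratio in a single window, a delta-method normal approximation for that log-ratio, and a copy-number ratio of 1.5 for a single copy gain and 0.5 for a single copy loss in a diploid genome. The function names, the significance level `alpha`, and the 0.5 pseudocount are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def approx_power(lam, ratio, alpha=0.05):
    """Normal (delta-method) approximation to the power of a one-sided test
    on log(tumor/normal) for a window with lam expected normal reads."""
    sd_null = math.sqrt(2.0 / lam)  # both counts have mean lam under the null
    crit = NormalDist().inv_cdf(1.0 - alpha) * sd_null
    alt = NormalDist(math.log(ratio), math.sqrt(1.0 / (ratio * lam) + 1.0 / lam))
    # Gains are tested in the upper tail, losses in the lower tail.
    return 1.0 - alt.cdf(crit) if ratio > 1 else alt.cdf(-crit)


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method
    (adequate for lam up to a few hundred)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulated_power(lam, ratio, alpha=0.05, trials=2000, seed=0):
    """Monte Carlo power using Poisson read counts and the same threshold."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1.0 - alpha) * math.sqrt(2.0 / lam)
    hits = 0
    for _ in range(trials):
        tumor = poisson(ratio * lam, rng)
        normal = poisson(lam, rng)
        stat = math.log((tumor + 0.5) / (normal + 0.5))  # pseudocount avoids log(0)
        detected = (stat > crit) if ratio > 1 else (stat < -crit)
        hits += detected
    return hits / trials


if __name__ == "__main__":
    for lam in (20, 50, 100):
        print(lam, round(approx_power(lam, 1.5), 3), simulated_power(lam, 1.5))
```

Comparing the two columns printed for each λ gives a rough feel for how closely the normal approximation tracks the Poisson simulation as the number of reads per window grows.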
Fig. 2. Segmentation algorithm for aligned sequenced reads
(a-c) Schematic overview of segmentation algorithm. (a) Candidate breakpoints (red dots) correspond to tumor read positions (black dots) whose local log-ratio statistic, D, passes a lenient significance threshold. (b) These candidate breakpoints define the boundaries of the initial copy-number segments (blue lines). Each point represents the estimated copy-number ratio for a 100 kb window. (c) A merging procedure yields the final list of copy-number segments (green lines) obtained for 10 genome-wide false positives. (d-e) Sensitivity to detect copy-number alterations as a function of the local window size parameter, w. A copy-number alteration of a particular size is introduced into a diploid genome sampled by 12 million aligned reads. Each line represents the fraction of 1000 spike-in simulations for which (d) a copy-number gain or (e) a copy-number loss was correctly identified by the segmentation algorithm.
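
The two-stage idea in panels (a-c), lenient candidate breakpoints followed by merging, can be sketched in a simplified form. This is not SegSeq: the caption describes a statistic computed at individual tumor read positions and tested for significance, whereas the sketch below substitutes fixed bins of read counts and plain thresholds on the log2 ratio. The function names, the 0.5 pseudocount, and both threshold parameters are assumptions made for illustration.

```python
import math


def log_ratio(tumor, normal):
    """Log2 tumor/normal ratio of summed bin counts, with a 0.5 pseudocount."""
    return math.log2((sum(tumor) + 0.5) / (sum(normal) + 0.5))


def candidate_breakpoints(tumor, normal, w, threshold):
    """Return bin indices i where the log-ratio of the w bins to the right of i
    differs from that of the w bins to the left of i by more than threshold."""
    candidates = []
    for i in range(w, len(tumor) - w + 1):
        right = log_ratio(tumor[i:i + w], normal[i:i + w])
        left = log_ratio(tumor[i - w:i], normal[i - w:i])
        if abs(right - left) > threshold:
            candidates.append(i)
    return candidates


def merge_segments(tumor, normal, breakpoints, merge_threshold):
    """Repeatedly remove the breakpoint between the two most similar adjacent
    segments until every remaining neighbor pair differs by merge_threshold or more."""
    bounds = [0] + sorted(breakpoints) + [len(tumor)]
    while len(bounds) > 2:
        ratios = [log_ratio(tumor[a:b], normal[a:b]) for a, b in zip(bounds, bounds[1:])]
        diffs = [abs(r2 - r1) for r1, r2 in zip(ratios, ratios[1:])]
        j = min(range(len(diffs)), key=diffs.__getitem__)
        if diffs[j] >= merge_threshold:
            break
        del bounds[j + 1]  # bounds[j + 1] separates segments j and j + 1
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    # Toy data: 20 bins, with a single copy gain (ratio 1.5) in the last 10.
    normal = [100] * 20
    tumor = [100] * 10 + [150] * 10
    cands = candidate_breakpoints(tumor, normal, w=3, threshold=0.3)
    print(cands)
    print(merge_segments(tumor, normal, cands, merge_threshold=0.3))
```

A lenient first threshold tends to flag several neighboring positions around each true change, so the merging pass is what collapses them into a single boundary.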
Fig. 3. Mapping the chromosomal breakpoints of homozygous deletions
(a-c) Breakpoint mapping with aligned sequence reads at: (a) the UTRN locus; (b) the PTPRD locus; or (c) the HS3ST3A1 locus. Each point represents the location of a sequence read aligning to the NCI-H2347 (blue) tumor cell line or its matched normal, NCI-BL2347 (black). Vertical green lines indicate the exact chromosomal breakpoints mapped by sequencing of a PCR product spanning each homozygous deletion. For each breakpoint, we report the difference between the predicted and actual breakpoint positions. (d-f) Breakpoint mapping with an Affymetrix SNP 6.0 Array, where each point represents the log2 copy-number ratio interrogated by an array probeset in: (d) the UTRN locus; (e) the PTPRD locus; or (f) the HS3ST3A1 locus. The minimum value for log2 copy-number ratios was set to -7. Horizontal blue lines represent copy-number segments inferred by the circular binary segmentation algorithm.
