Comparative Study
BMC Genomics. 2017 Apr 24;18(1):321.
doi: 10.1186/s12864-017-3658-x.

Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans


Rajini R Haraksingh et al. BMC Genomics. 2017.

Abstract

Background: High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data.

Results: The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays, CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated by using different analysis software and optimizing data analysis parameters.

Conclusions: High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.

Keywords: Array Comparative Genome Hybridization (aCGH); Copy Number Variation (CNV); SNP array.


Figures

Fig. 1
Size and nature of gold standard CNVs from sample NA12878. a. Histogram showing size distribution of NA12878 gold standard CNVs. Bin sizes change by a factor of ten across panels. Peaks in the size range of Alu elements, in the 301–400 bp bin, and in the size range of LINE1 elements, in the 5001–6000 bp bin, are indicated by arrows. b. Distribution of total numbers of gold standard deletions and duplications by size. It can reasonably be expected that most CNVs smaller than 1 kb in size are not detectable by the arrays in this study
Fig. 2
CNV detection performance of each array using two different algorithms. a. Overlap of autosomal CNV calls from two different algorithms for each array platform with gold standard CNVs. Data is shown for each of two technical replicates per array. CNV call sets derived from the platform-specific algorithm are in green, yellow, and red, and those derived from Nexus are in blue, pink, and purple. The number of array CNV calls overlapping a gold standard CNV by 50% reciprocally in size is in green and blue, by less than 50% reciprocally in size is in yellow and pink, and not overlapping a gold standard CNV is in red and purple. Array calls not overlapping a gold standard CNV at all were further analyzed for sequencing-based confirmation using CNVnator-generated CNV calls based on the 1000 Genomes Project sequencing data for NA12878. The number of CNV calls not overlapping a gold standard CNV but with CNVnator support is shown as solid red or purple bars. The number of CNV calls not overlapping a gold standard CNV and with no CNVnator support is shown as hashed red or purple bars. b. Average rate of non-validated CNV calls for each array platform and for each algorithm. The rate of non-validated calls is calculated as the percentage of the total number of CNVs called from an array that do not overlap a gold standard CNV and do not have any supporting evidence from CNVnator (hashed red and purple bars in a.). Average rate of non-validated calls is based on two technical replicates
Fig. 3
Detection of a 122 kb gold standard deletion on chromosome 19p by 17 arrays. Horizontal axis shows position along chromosome 19. Vertical axes show log R ratio of fluorescence of NA12878 DNA over fluorescence of reference DNA. Grey dots indicate probes that have not been called as part of a CNV. Red dots indicate probes that have been called as part of a CNV. Horizontal lines indicate Nexus cutoffs for low and high copy deletions (red) and duplications (blue). Gray dashed box indicates CNV region. Genes and segmental duplications (SegDups) are also shown. 1. Affymetrix SNP6.0, 2. Affymetrix CytoScanHD, 3. Agilent 1×1M-CGH, 4. Agilent 1×1M-HR, 5. Agilent 2×400K-CGH, 6. Agilent 2×400K-CNV, 7. Agilent 4×180K-CGH, 8. Illumina HumanOmni5Exome, 9. Illumina HumanOmni5, 10. Illumina HumanOmni2.5Exome, 11. Illumina HumanOmni2.5, 12. Illumina HumanOmni1Quad, 13. Illumina HumanOmniExpressExome, 14. Illumina HumanOmniExpress, 15. Illumina CoreExome, 16. Illumina CytoSNP-850, 17. Illumina Psych Array
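
The overlap categories and the non-validated rate described in the Fig. 2 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the half-open (start, end) interval representation, the reading of "50% reciprocally" as "at least 50% of both intervals", and the rule that any base-pair overlap with a CNVnator call counts as support are all assumptions made for illustration; all intervals are assumed to lie on the same chromosome.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end), half-open, one chromosome.
Interval = Tuple[int, int]


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def classify_call(call: Interval, gold: List[Interval],
                  threshold: float = 0.5) -> str:
    """Return 'reciprocal', 'partial', or 'none' for one array CNV call.

    'reciprocal': the overlap covers at least `threshold` of BOTH the call
    and some gold standard CNV (assumed meaning of 50% reciprocal overlap).
    'partial': some overlap exists, but below the reciprocal threshold.
    'none': no gold standard CNV is touched.
    """
    status = "none"
    for g in gold:
        shared = overlap_bp(call, g)
        if shared == 0:
            continue
        frac_of_call = shared / (call[1] - call[0])
        frac_of_gold = shared / (g[1] - g[0])
        if min(frac_of_call, frac_of_gold) >= threshold:
            return "reciprocal"
        status = "partial"
    return status


def non_validated_rate(calls: List[Interval], gold: List[Interval],
                       cnvnator: List[Interval]) -> float:
    """Percentage of calls with no gold standard overlap and no CNVnator
    support (support = any overlap, an assumption of this sketch)."""
    if not calls:
        return 0.0
    unsupported = sum(
        1 for c in calls
        if classify_call(c, gold) == "none"
        and not any(overlap_bp(c, s) > 0 for s in cnvnator)
    )
    return 100.0 * unsupported / len(calls)


if __name__ == "__main__":
    # Toy coordinates, not data from the study.
    gold = [(1000, 2000)]
    cnvnator = [(3500, 3600)]
    calls = [(1100, 2100), (1900, 5000), (3000, 4000), (8000, 9000)]
    for c in calls:
        print(c, classify_call(c, gold))
    print(non_validated_rate(calls, gold, cnvnator))
```

In this toy input, only the last call has neither gold standard overlap nor CNVnator support, so one of four calls counts as non-validated.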
