Nucleic Acids Res. 2013 Mar 1;41(5):e64.
doi: 10.1093/nar/gks1346. Epub 2013 Jan 4.

ParseCNV integrative copy number variation association software with quality tracking

Joseph T Glessner et al. Nucleic Acids Res.

Abstract

A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case-control and family-based study designs, addressing both de novo and inheritance events. These statistics are then summarized by CNV region (CNVR). CNVRs are defined dynamically to allow for complex CNV overlap while maintaining a precise association region. This approach avoids the failure-to-converge and non-monotonic curve-fitting weaknesses of programs such as CNVtools and CNVassoc. Plink is easy to use, but it provides only probe-based statistics for the combined CNV state, not state-specific CNVRs. Existing CNV association methods provide no quality tracking information for filtering confident associations, a key issue that ParseCNV fully addresses. In addition, uncertainty in the CNV calls underlying an association is evaluated to verify significant results, using CNV overlap profiles, genomic context, the number of probes supporting the CNV and single-probe intensities. When optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, a stage that is often problematic because significant associations are inadequately reviewed. ParseCNV is freely available at http://parsecnv.sourceforge.net.

Figures

Figure 1.
CNV Analysis Workflow. Pre-processing, file formats and post-processing. This general framework shows the stepwise procedure to prepare input data to use and evaluate ParseCNV output. ‘…’ represents additional columns not shown.
Figure 2.
Possible statistical contingency table definitions to capture CNV frequency difference in cases versus control subjects. The middle statistical definition of deletions signifying loss of function mutations and duplications signifying gain of function mutations is used predominantly. This is in contrast to a view that all CNVs are similarly detrimental put forth by the top statistical definition and the view that all CNV states lead to a unique outcome put forth by the bottom statistical definition.
Figure 3.
Complex CNV Overlap and CNVR definition examples. Rectangles represent individual sample CNV call boundaries as provided by a CNV calling algorithm. Each assayed point represented by the probe framework listed in the map file input determines the possible boundary assignments. The CNVR definition assigned by ParseCNV is shown as a dashed box. Small variance in individual CNV call boundaries allows extension of CNVR definition. CNV peninsula is shown as the most common false-positive result based on variable extension of CNV boundary (typically the region common to cases and controls has many probes, whereas the case only extension has few probes).
Figure 4.
Quantitative PCR validation of CNVR associations. Each sample with attempted validation for a specific CNV at a specific locus is shown. The validation data output is 0.5 for deletions, 1 for diploid, 1.5 for duplications with standard error values from triplicate runs.
Figure 5.
Sampling of different settings of distance (1 Mb) and significance (±1 power of 10 P-value). Based on a data set of 785 cases versus 1110 control subjects and 561 308 probes. This sampling procedure shows that the defaults are justifiable: they balance CNVR extension, which allows for boundary variability, against maintaining unique loci, except in rare instances. The x-axis shows the CNVR type and distance setting. The colour shows the P-value variance setting. The y-axis shows the count of CNVRs resulting from these settings.
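The distance and significance settings in this caption can be illustrated with a minimal sketch of probe-to-CNVR merging. This is not ParseCNV's code: the function name, the `Probe` type, the 0.05 significance cut-off and the choice to compare each probe with the previous probe in the region are all assumptions made for illustration. Only the 1 Mb distance and the one power of 10 P-value tolerance come from the caption.

```python
import math
from typing import NamedTuple


class Probe(NamedTuple):
    chrom: str
    pos: int
    p: float  # probe-level association P-value, assumed > 0


def merge_probes_into_cnvrs(probes, max_gap=1_000_000, max_log10_diff=1.0, alpha=0.05):
    """Group significant probes into regions (hypothetical helper).

    A probe joins the current region when it is on the same chromosome,
    lies within max_gap bases of the previous probe in the region, and its
    P-value is within max_log10_diff powers of 10 of that probe's P-value.
    """
    significant = sorted(
        (pr for pr in probes if pr.p < alpha),
        key=lambda pr: (pr.chrom, pr.pos),
    )
    regions = []
    for pr in significant:
        if regions:
            last = regions[-1][-1]
            close = pr.chrom == last.chrom and pr.pos - last.pos <= max_gap
            similar = abs(math.log10(pr.p) - math.log10(last.p)) <= max_log10_diff
            if close and similar:
                regions[-1].append(pr)
                continue
        regions.append([pr])
    # Report each region as (chromosome, start, end, smallest P-value).
    return [(r[0].chrom, r[0].pos, r[-1].pos, min(x.p for x in r)) for r in regions]


# Made-up probes, for illustration only.
example_probes = [
    Probe("chr1", 100, 0.01),
    Probe("chr1", 5_000, 0.02),
    Probe("chr1", 9_000, 0.5),        # not significant, ignored
    Probe("chr1", 2_000_000, 0.01),   # more than 1 Mb from the previous probe
    Probe("chr2", 100, 1e-6),
    Probe("chr2", 200, 1e-3),         # P-value differs by 3 powers of 10
]
example_cnvrs = merge_probes_into_cnvrs(example_probes)
```

Widening `max_gap` or `max_log10_diff` merges more probes into fewer, longer regions; tightening them splits regions apart. That is the trade-off the caption describes between tolerating boundary variability and keeping loci distinct.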
Figure 6.
Increased frequency of a specific CNV state in cases. chr14:104241048–104348254: 4:0 (case:control) deletions, 2:11 duplications, 6:11 combined. ParseCNV provides case-enriched deletion significance for this region, P = 0.03 (duplication control-enriched, P = 0.09). As Plink uses only the combined count definition, P = 1 and the region is missed. chr11:133663955–133715739: 1:3 deletions, 5:0 duplications, 6:3 combined. ParseCNV provides case-enriched duplication significance for this region, P = 0.01 (deletion control-enriched, P = 0.65). As Plink uses only the combined count definition, P = 0.12 and the region is missed.
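A small sketch shows why state-specific counts can reveal a signal that combined counts hide. It applies a two-sided Fisher exact test to the chr14 carrier counts given in this caption. The caption does not state the cohort sizes or the exact test used; the 785 cases and 1110 control subjects are borrowed from the Figure 5 data set as an assumption, the function name is hypothetical, and the sketch does not attempt to reproduce the reported P-values.

```python
from math import comb


def fisher_exact_two_sided(case_hits, control_hits, n_cases, n_controls):
    """Two-sided Fisher exact P-value for carriers in cases versus controls.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    total = n_cases + n_controls
    carriers = case_hits + control_hits
    denom = comb(total, carriers)

    def prob(k):
        # Probability that exactly k of the carriers are cases.
        return comb(n_cases, k) * comb(n_controls, carriers - k) / denom

    p_obs = prob(case_hits)
    lo = max(0, carriers - n_controls)
    hi = min(carriers, n_cases)
    p_value = sum(
        p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Assumed cohort sizes (taken from the Figure 5 data set, not from this caption).
N_CASES, N_CONTROLS = 785, 1110

# chr14 carrier counts (case, control) from the caption.
p_deletion = fisher_exact_two_sided(4, 0, N_CASES, N_CONTROLS)
p_duplication = fisher_exact_two_sided(2, 11, N_CASES, N_CONTROLS)
p_combined = fisher_exact_two_sided(6, 11, N_CASES, N_CONTROLS)

print(f"deletions   P = {p_deletion:.3f}")
print(f"duplications P = {p_duplication:.3f}")
print(f"combined    P = {p_combined:.3f}")
```

Pooling deletions and duplications mixes a case-enriched state with a control-enriched one, so the combined table sits close to the null expectation even though the deletion table alone does not.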
