Comparative Study
BMC Bioinformatics. 2015 Jan 28;16:17.
doi: 10.1186/s12859-014-0428-5.

Quantitative analysis of differences in copy numbers using read depth obtained from PCR-enriched samples and controls

Frank Reinecke et al. BMC Bioinformatics. 2015.

Abstract

Background: Next-generation sequencing (NGS) is rapidly becoming common practice in clinical diagnostics and cancer research. In addition to the detection of single nucleotide variants (SNVs), information on copy number variants (CNVs) is of great interest. Several algorithms exist to detect CNVs by analyzing whole genome sequencing data or data from samples enriched by hybridization-capture. PCR-enriched amplicon-sequencing data have special characteristics that have been taken into account by only one publicly available algorithm so far.

Results: We describe a new algorithm named quandico to detect copy number differences based on NGS data generated following PCR-enrichment. A weighted t-test statistic was applied to calculate probabilities (p-values) of copy number changes. We assessed the performance of the method using sequencing reads generated from reference DNA with known CNVs, and we were able to detect these variants with 98.6% sensitivity and 98.5% specificity, which is significantly better than another recently described method for amplicon sequencing. The source code (R-package) of quandico is licensed under the GPLv3 and is available at https://github.com/reineckef/quandico.

Conclusion: We demonstrated that our new algorithm is suitable to call copy number changes using data from PCR-enriched samples with high sensitivity and specificity even for single copy differences.
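The abstract mentions a weighted t-test on read-depth data but gives no formula. The following Python sketch shows one common formulation of a weighted one-sample t-test applied to per-amplicon log2 depth ratios (sample versus control) at a single locus. It is not the quandico implementation, which is an R package. The function names, the choice of weights (for example, control read depth per amplicon), and the example numbers are all assumptions made for illustration.

```python
import math


def weighted_t(values, weights):
    """Weighted one-sample t statistic testing whether the mean is zero.

    Assumed formulation (not taken from the paper): reliability weights,
    an unbiased weighted variance, and an effective sample size.
    """
    if len(values) < 2 or len(values) != len(weights):
        raise ValueError("need at least two values and matching weights")
    v1 = sum(weights)
    v2 = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, values)) / v1
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / (v1 - v2 / v1)
    if var == 0:
        raise ValueError("zero variance: t statistic is undefined")
    n_eff = v1 * v1 / v2              # effective number of observations
    se = math.sqrt(var / n_eff)       # standard error of the weighted mean
    return mean / se, n_eff - 1       # t statistic, degrees of freedom


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value from Student's t distribution.

    Integrates the density from 0 to |t| with Simpson's rule so that the
    sketch needs only the standard library (steps must be even).
    """
    t = abs(t)
    if t == 0:
        return 1.0
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    area = pdf(0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return max(0.0, 1 - 2 * area)


# Hypothetical log2(sample/control) depth ratios for five amplicons at one
# locus, weighted by hypothetical control read depth per amplicon.
log2_ratios = [0.55, 0.62, 0.48, 0.60, 0.51]
control_depth = [800, 1200, 300, 1000, 650]

t_stat, dof = weighted_t(log2_ratios, control_depth)
print(t_stat, dof, two_sided_p(t_stat, dof))
```

With equal weights this reduces to the ordinary one-sample t-test. Unequal weights let well-covered amplicons contribute more to the locus-level estimate.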

Figures

Figure 1
Dispersion correction. Illustration of the dispersion correction effect by ϕ. First row: before correction, second row: after correction. Calls for the sequencing data sets (M62 and M63) and both control samples (NA12898, NA19129) are plotted separately (in columns). A CNV is called if the determined Q score is higher than the threshold (normalized to 1 in this diagram). False classifications (FN: false negative, FP: false positive) are shown in red. Loci with known CNVs in the sample are shown as dots and loci with normal copy number are plotted as crosses.
Figure 2
ROC curve. Receiver operating characteristic curves for ONCOCNV (o) and our own development using the comparison dataset (Table 2): Plain t-test statistics without corrections (naive, n) compared to the performance achieved by using weights (w) or removing outliers (r). The final algorithm quandico (q) includes both of these steps together with an additional dispersion balancing. Symbols are plotted for local maxima. The inset shows a magnification of the region above 90% specificity and sensitivity.
Figure 3
False positive/negative rates. False positive rates (FPR) and false negative rates (FNR) observed on the comparison dataset. The optimal threshold for every algorithm was determined by selecting the value that generated the minimal sum of FPR and FNR. The scores for every individual algorithm (x-axis) were then divided by the identified threshold (normalized to 1) for comparison. For algorithm details, see legend of Figure 2.
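The threshold selection described in this legend can be sketched in a few lines of Python. The sketch assumes that a locus is called when its score is strictly higher than the threshold (as in the Figure 1 legend) and that candidate thresholds are taken from the observed scores. The function names and example data are hypothetical and do not come from the quandico package.

```python
def error_rates(scores, has_cnv, threshold):
    """Return (FPR, FNR) when loci with score > threshold are called as CNVs."""
    negatives = sum(1 for y in has_cnv if not y)
    positives = len(has_cnv) - negatives
    fp = sum(1 for s, y in zip(scores, has_cnv) if not y and s > threshold)
    fn = sum(1 for s, y in zip(scores, has_cnv) if y and s <= threshold)
    return fp / negatives, fn / positives


def best_threshold(scores, has_cnv):
    """Pick the observed score that minimizes FPR + FNR (lowest value on ties)."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda t: sum(error_rates(scores, has_cnv, t)))


def normalize(scores, threshold):
    """Divide scores by the threshold so that the decision boundary sits at 1."""
    return [s / threshold for s in scores]


# Hypothetical scores for eight loci; True marks a locus with a known CNV.
scores = [5, 12, 8, 40, 55, 30, 9, 60]
has_cnv = [False, False, False, True, True, True, False, True]

threshold = best_threshold(scores, has_cnv)
print(threshold, error_rates(scores, has_cnv, threshold))
print(normalize(scores, threshold))
```

Normalizing by each algorithm's own threshold puts scores from different algorithms on a common axis, where values above 1 are calls. The division assumes a positive threshold.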
