2019 Apr 1;8(4):giz040.
doi: 10.1093/gigascience/giz040.

Duphold: scalable, depth-based annotation and curation of high-confidence structural variant calls

Brent S Pedersen et al. Gigascience.

Abstract

Most structural variant (SV) detection methods use clusters of discordant read-pair and split-read alignments to identify variants yet do not integrate depth of sequence coverage as an additional means to support or refute putative events. Here, we present "duphold," a new method to efficiently annotate SV calls with sequence depth information that can add (or remove) confidence to SVs that are predicted to affect copy number. Duphold indicates not only the change in depth across the event but also the presence of a rapid change in depth relative to the regions surrounding the break-points. It uses a unique algorithm that allows the run time to be nearly independent of the number of variants. This performance is important for large, jointly called projects with many samples, each of which must be evaluated at thousands of sites. We show that filtering on duphold annotations can greatly improve the specificity of SV calls. Duphold can annotate SV predictions made from both short-read and long-read sequencing datasets. It is available under the MIT license at https://github.com/brentp/duphold.
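The idea of comparing depth inside an event with depth in the surrounding regions can be illustrated with a minimal Python sketch. This is not duphold's implementation: the function name `flank_fold_change`, the flank width, and the use of the median are assumptions chosen only to show the kind of ratio being described.

```python
from statistics import median


def flank_fold_change(depth, start, end, flank=1000):
    """Ratio of median depth inside [start, end) to median depth in the flanks.

    `depth` is a per-base depth list for one chromosome. The flank width and
    the choice of median are illustrative assumptions, not duphold's method.
    """
    if not 0 <= start < end <= len(depth):
        raise ValueError("event must lie within the depth array")
    inside = depth[start:end]
    # Bases immediately left and right of the putative break-points.
    left = depth[max(0, start - flank):start]
    right = depth[end:end + flank]
    flanks = left + right
    if not flanks:
        raise ValueError("no flanking bases available")
    flank_median = median(flanks)
    if flank_median == 0:
        # Avoid division by zero when the flanks have no coverage.
        return float("inf") if median(inside) > 0 else 1.0
    return median(inside) / flank_median


# Toy data: 30x background with a 50-base region at about half the depth,
# as might be expected for a heterozygous deletion.
depth = [30] * 100 + [15] * 50 + [30] * 100
ratio = flank_fold_change(depth, 100, 150, flank=50)
```

A ratio well below 1 is consistent with a deletion and a ratio well above 1 with a duplication, while a ratio near 1 suggests no copy-number change at the site.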

Keywords: algorithm; genomics; structural variation.

Figures

Figure 1:
Evaluation of duphold on duplications and deletions of any size. We annotated 805 GiaB insertion calls as duplications and simulated homozygous reference (Hom. Ref.) events of similar size in order to evaluate the specificity and sensitivity of duphold. We show the distribution of DHFFC (duphold flank fold-change) for each genotype (homozygous reference [0/0] is blue, heterozygous [Het.] [0/1] is orange, and homozygous alternate [Hom. Alt.] [1/1] is green), for both duplications (A) and deletions (C). We then used those distributions to create receiver operating characteristic curves (B and D) and calculate AUCs that indicate the ability of duphold to differentiate 0/0 from 0/1 (orange) and 1/1 (green). The dots on the curves indicate a cutoff of 1.3 for duplications and 0.7 for deletions.
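The cutoffs marked on the curves can be applied as a simple filter. The Python sketch below is a hedged illustration: the 1.3 and 0.7 values come from the caption, but the function name `depth_supports_call`, the record layout, and the direction of each comparison (higher than the cutoff for duplications, lower for deletions) are assumptions.

```python
DUP_CUTOFF = 1.3  # duplication cutoff stated in the Figure 1 caption
DEL_CUTOFF = 0.7  # deletion cutoff stated in the Figure 1 caption


def depth_supports_call(svtype, dhffc):
    """Return True if the flank fold-change is consistent with the SV type.

    The comparison directions are an assumption: a duplication should raise
    depth relative to the flanks and a deletion should lower it.
    """
    if svtype == "DUP":
        return dhffc > DUP_CUTOFF
    if svtype == "DEL":
        return dhffc < DEL_CUTOFF
    raise ValueError(f"unsupported SV type: {svtype!r}")


# Hypothetical calls; the identifiers and values are made up for illustration.
calls = [
    {"id": "sv1", "svtype": "DEL", "dhffc": 0.52},
    {"id": "sv2", "svtype": "DEL", "dhffc": 0.98},
    {"id": "sv3", "svtype": "DUP", "dhffc": 1.47},
    {"id": "sv4", "svtype": "DUP", "dhffc": 1.05},
]
kept = [c["id"] for c in calls if depth_supports_call(c["svtype"], c["dhffc"])]
```

Here `kept` holds only the calls whose depth change agrees with the predicted event type.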
Figure 2:
Duphold scalability. The time to annotate (or genotype) for duphold and svtyper is shown (y-axis) as a function of the number of variants tested (x-axis). While svtyper (blue) exhibits a linear increase in time with the number of variants, duphold is relatively independent of the number of variants. There is an initial cost that makes the duphold strategy less efficient for few (fewer than ∼10,000) variants, but it scales well to annotating thousands of variants as we expect for large cohorts.
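The text here does not describe how duphold achieves this cost profile. One generic way to obtain an up-front cost followed by cheap per-variant queries is a prefix-sum array over per-base depth, sketched below in Python. This is a swapped-in technique for illustration only, not duphold's algorithm, and the names `build_prefix_sums` and `mean_depth` are hypothetical.

```python
from itertools import accumulate


def build_prefix_sums(depth):
    """One pass over the per-base depth: prefix[i] is the sum of depth[:i]."""
    return [0] + list(accumulate(depth))


def mean_depth(prefix, start, end):
    """Mean depth over [start, end) in constant time per query."""
    if not 0 <= start < end < len(prefix):
        raise ValueError("interval must lie within the depth array")
    return (prefix[end] - prefix[start]) / (end - start)


# Toy data: 30x background with a 50-base region at 15x.
depth = [30] * 100 + [15] * 50 + [30] * 100
prefix = build_prefix_sums(depth)      # initial cost, paid once
del_mean = mean_depth(prefix, 100, 150)  # each later query is O(1)
all_mean = mean_depth(prefix, 0, 250)
```

With this layout the initial pass dominates the run time, so adding more variants adds little work, which matches the shape of the trade-off described in the caption.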

