Bioinformatics. 2024 Feb 1;40(2):btae066.
doi: 10.1093/bioinformatics/btae066.

VariantDetective: an accurate all-in-one pipeline for detecting consensus bacterial SNPs and SVs

Philippe Charron et al. Bioinformatics.

Abstract

Motivation: Genomic variations comprise a spectrum of alterations, ranging from single nucleotide polymorphisms (SNPs) to large-scale structural variants (SVs), which play crucial roles in bacterial evolution and species diversification. Accurately identifying SNPs and SVs is beneficial for subsequent evolutionary and epidemiological studies. This study presents VariantDetective (VD), a novel, user-friendly, and all-in-one pipeline combining SNP and SV calling to generate consensus genomic variants using multiple tools.

Results: The VD pipeline accepts various file types as input to initiate SNP and/or SV calling, and benchmarking results demonstrate VD's robustness and high accuracy across multiple tested datasets when compared to existing variant calling approaches.

Availability and implementation: The source code, test data, and relevant information for VD are freely accessible at https://github.com/OLF-Bioinformatics/VariantDetective under the MIT License.

Conflict of interest statement

None declared.

Figures

Figure 1.
Performance and benchmarking of VD. (A, B) Benchmarking of VD against three SNP callers (A) and four SV callers (B) using 15 simulated datasets. SURVIVOR was used to simulate SNPs and different types of SVs in the assemblies of representative L. acidophilus, E. coli, and B. mallei genomes. (C, D) Comparison of F1 scores of VD with 41 SNP calling pipelines (C) and overall accuracy of VD against two top SNP callers, DV and HC (D). F1 scores of the 41 SNP calling pipelines, including DV and HC, were obtained from a previous study (Bush et al. 2020). (E, F) Performance of VD against four SV callers on human SV benchmarking datasets, showing recall (E) and the number of predicted SVs (F). Parameters used for each caller were taken from a previous benchmarking publication (Tham et al. 2020). The "2 callers" and "3 callers" labels denote results obtained by requiring a minimum consensus of two or three callers, respectively, to generate the final variant list. VD: VariantDetective; FB: Freebayes; HC: GATK HaplotypeCaller; CL: Clair3; NV: NanoVar; NS: NanoSV; CS: CuteSV; SI: SVIM; DV: DeepVariant.
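
The minimum-consensus idea and the F1 and recall metrics named in the caption can be illustrated with a short Python sketch. This is not VD's implementation: it assumes each variant is reduced to a hypothetical `(chrom, pos, ref, alt)` tuple and that two callers agree only on an exact match, whereas real pipelines typically normalise variants and match SVs within a distance tolerance. The function names and the toy calls are invented for illustration; only the caller abbreviations (FB, HC, CL) come from the caption.

```python
from collections import Counter


def consensus_variants(calls_by_caller, min_callers=2):
    """Keep variants reported by at least `min_callers` distinct callers.

    Assumption: variants are hashable keys compared by exact equality.
    """
    counts = Counter()
    for variants in calls_by_caller.values():
        counts.update(set(variants))  # each caller votes once per variant
    return {variant for variant, n in counts.items() if n >= min_callers}


def precision_recall_f1(predicted, truth):
    """Return (precision, recall, F1) for a predicted set against a truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up calls from three SNP callers (hypothetical data).
calls = {
    "FB": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
    "HC": [("chr1", 100, "A", "G"), ("chr1", 400, "G", "A")],
    "CL": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
}
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr1", 400, "G", "A"),
}

two_callers = consensus_variants(calls, min_callers=2)
three_callers = consensus_variants(calls, min_callers=3)

for label, called in (("2 callers", two_callers), ("3 callers", three_callers)):
    p, r, f1 = precision_recall_f1(called, truth)
    print(f"{label}: n={len(called)} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy case, raising the threshold from two to three callers shrinks the final list and lowers recall while precision stays the same, which is the kind of trade-off the "2 callers" and "3 callers" results in the figure compare.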

References

    1. Barbitoff YA, Abasov R, Tvorogova VE et al. Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery. BMC Genomics 2022;23:155.
    2. Becker T, Lee W-P, Leone J et al. FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods. Genome Biol 2018;19:38.
    3. Bush SJ, Foster D, Eyre DW et al. Genomic diversity affects the accuracy of bacterial single-nucleotide polymorphism-calling pipelines. Gigascience 2020;9:giaa007. doi: 10.1093/gigascience/giaa007.
    4. Chiara M, Gioiosa S, Chillemi G et al. CoVaCS: a consensus variant calling system. BMC Genomics 2018;19:120.
    5. Chiliński M, Plewczynski D. ConsensuSV-from the whole-genome sequencing data to the complete variant list. Bioinformatics 2022;38:5440–2.
