Comput Struct Biotechnol J. 2025 Jul 16:27:3147-3154.
doi: 10.1016/j.csbj.2025.07.011. eCollection 2025.

CaMutQC: An R package for integrative quality control and filtration of cancer somatic mutations

Xin Wang et al. Comput Struct Biotechnol J.

Abstract

The quality control and filtration of cancer somatic mutations (CAMs), including the elimination of false positives caused by technical bias and the selection of key mutation candidates, are crucial steps for downstream analysis in cancer genomics. However, because needs differ and standardized filtering criteria are lacking, filtering strategies vary from study to study, often reducing efficiency, accuracy, and reproducibility. Here, we present CaMutQC, a heuristic quality control and soft-filtering R/Bioconductor package designed specifically for CAMs. CaMutQC enables users to remove false positive mutations, select potential mutation candidates, and estimate tumor mutation burden (TMB) with a single line of code, using either default or customized parameters. A filter report and a code log can also be generated after filtration to facilitate reproducibility and comparison. Applied to a whole-exome sequencing (WES) benchmark dataset, CaMutQC eliminated 85.55 % of false positive single nucleotide variants (SNVs) while retaining 90.72 % of true positive SNVs. A further 11.56 % of true positive SNVs were rescued through CaMutQC's built-in union strategy. Similar results were observed for insertions and deletions (INDELs). CaMutQC is freely available through Bioconductor at https://bioconductor.org/packages/CaMutQC/ under the GPL v3 license.
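
As a rough illustration of how the two benchmark percentages quoted in the abstract can be read, the Python sketch below computes a false positive elimination rate and a true positive retention rate from call counts before and after filtering. The function names and the counts are hypothetical and chosen only for illustration; they are not part of CaMutQC and are not taken from the benchmark.

```python
def elimination_rate(before: int, after: int) -> float:
    """Percentage of calls removed by filtering (used here for false positives)."""
    if before <= 0:
        raise ValueError("before must be a positive count")
    if not 0 <= after <= before:
        raise ValueError("after must lie between 0 and before")
    return 100.0 * (before - after) / before


def retention_rate(before: int, after: int) -> float:
    """Percentage of calls kept by filtering (used here for true positives)."""
    if before <= 0:
        raise ValueError("before must be a positive count")
    if not 0 <= after <= before:
        raise ValueError("after must lie between 0 and before")
    return 100.0 * after / before


# Toy counts, not benchmark data: 200 false positives before filtering, 50 after;
# 1000 true positives before filtering, 900 after.
fp_eliminated = elimination_rate(before=200, after=50)
tp_retained = retention_rate(before=1000, after=900)
print(f"FP eliminated: {fp_eliminated:.2f} %, TP retained: {tp_retained:.2f} %")
```

A good filter pushes the first number up while keeping the second close to 100 %, which is the trade-off the reported figures summarize.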

Keywords: CaMutQC; Cancer somatic mutation; False positive mutation; Mutation filtering; Quality control.

Conflict of interest statement

No potential conflict of interest was reported by the authors.

Figures

Graphical abstract
Fig. 1
A. An overview of CaMutQC. B. CaMutQC links variant calling to downstream analysis by taking annotated variant calls, filtering them, and passing them to downstream tools.
Fig. 2
Benchmark results of sample WES_EA_1. The y-axis is broken near the top to better display the main data range. A. TP and FP SNV counts after different procedures. The number of FP SNVs decreases markedly after CaMutQC filtration, with or without the callers' internal filtration. B. TP and FP INDEL counts after different procedures. raw: raw counts with no filtration; C: filtered by CaMutQC (CaMutQC-mutFilterTech); internal: processed by callers’ internal filter.
Fig. 3
Benchmark results of sample WGS_EA_1. A. TP and FP SNV counts after different procedures. B. TP and FP INDEL counts after different procedures. C: filtered by CaMutQC (CaMutQC-mutFilterTech); internal: processed by callers’ internal filter.
Fig. 4
An example of the updated union strategy. A. An illustration of the union strategy applied in this study. The union of CaMutQC-filtered (CaMutQC-mutFilterTech) variants from three callers, MuTect2, VarScan2, and MuSE, was obtained. B. Union results of SNVs in WES_EA_1. The number of TPs increases by 18.36 % on average among the three callers.
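
The union step described in the Fig. 4 caption can be sketched as a set union over per-caller filtered call sets. The Python below is a minimal illustration only: the variant key (chromosome, position, reference allele, alternate allele), the function names, and the toy variants are assumptions made for this example, and it does not reproduce CaMutQC's R implementation or any benchmark data.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def union_of_filtered(calls_by_caller: Dict[str, Iterable[Variant]]) -> Set[Variant]:
    """Merge already-filtered variant calls from several callers into one set."""
    merged: Set[Variant] = set()
    for variants in calls_by_caller.values():
        merged.update(variants)
    return merged


def count_true_positives(variants: Iterable[Variant], truth: Set[Variant]) -> int:
    """Count calls that also appear in a truth set."""
    return len(set(variants) & truth)


# Toy data, invented for illustration.
truth = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "T")}
filtered = {
    "MuTect2": [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")],
    "VarScan2": [("chr1", 100, "A", "T")],
    "MuSE": [("chr2", 50, "C", "T"), ("chr3", 10, "T", "G")],
}

union_calls = union_of_filtered(filtered)
for caller, calls in filtered.items():
    print(caller, count_true_positives(calls, truth))
print("union", count_true_positives(union_calls, truth))
```

In this toy case each caller alone recovers at most two of the three true variants, while the union recovers all three. The union also carries along any false positive that survived filtering in a single caller, which is why the sketch takes the union of filtered calls rather than raw calls.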
