KATK: Fast genotyping of rare variants directly from unmapped sequencing reads
- PMID: 33715282
- DOI: 10.1002/humu.24197
Abstract
KATK is a fast and accurate software tool for calling variants directly from raw next-generation sequencing reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning the retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. The reference or variant allele is called only if there is sufficient evidence for its presence in the data. As a result, KATK is not biased against rare variants or de-novo mutations. With simulated datasets, we achieved a false-negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage.
Keywords: de-novo mutations; k-mers; mutation discovery; next-generation sequencing; rare mutations.
© 2021 Wiley Periodicals LLC.
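The two ideas in the abstract, retrieving reads by predefined k-mers and defaulting to NC unless the evidence is sufficient, can be illustrated with a toy sketch. This is not KATK's implementation. The function names, the k-mer length of 5, and the `min_depth` and `min_fraction` thresholds are all hypothetical, and the local alignment step is replaced here by an exact match against allele-specific k-mers to keep the example short.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def retrieve_reads(reads, target_kmers, k):
    """Keep only the reads that contain at least one predefined k-mer."""
    return [r for r in reads if any(km in target_kmers for km in kmers(r, k))]


def count_alleles(reads, ref_kmer, alt_kmer):
    """Count reads supporting each allele.

    Assumption: exact k-mer matching stands in for local alignment.
    """
    counts = Counter()
    for r in reads:
        if ref_kmer in r:
            counts["ref"] += 1
        if alt_kmer in r:
            counts["alt"] += 1
    return counts


def call_genotype(counts, min_depth=4, min_fraction=0.2):
    """Return "NC" unless the evidence is sufficient (hypothetical thresholds)."""
    depth = counts["ref"] + counts["alt"]
    if depth == 0 or depth < min_depth:
        return "NC"  # default genotype: no call
    has_ref = counts["ref"] / depth >= min_fraction
    has_alt = counts["alt"] / depth >= min_fraction
    if has_ref and has_alt:
        return "ref/alt"
    if has_alt:
        return "alt/alt"
    if has_ref:
        return "ref/ref"
    return "NC"


# Toy data: two flanking k-mers select the region of interest.
target_kmers = {"GACGT", "CCGAT"}
ref_read = "GACGTTAGCCGAT"   # carries the reference allele (TTAGC)
alt_read = "GACGTTGGCCGAT"   # carries the variant allele (TTGGC)
off_target = "CCCCCCCCCCCCC"  # shares no k-mer with the target set
reads = [ref_read] * 3 + [alt_read] * 3 + [off_target] * 2

selected = retrieve_reads(reads, target_kmers, k=5)
genotype = call_genotype(count_alleles(selected, "TTAGC", "TTGGC"))
print(len(selected), genotype)  # 6 ref/alt
```

With a single supporting read, the same call returns NC because the depth falls below the hypothetical `min_depth`, which mirrors the no-call default described above.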