K-Mer-Based Genome Size Estimation in Theory and Practice
- PMID: 37335470
- DOI: 10.1007/978-1-0716-3226-0_4
Abstract
Recent advances in sequencing technologies have made genome sequencing of non-model organisms with very large and complex genomes possible. The data can be used to estimate diverse genome characteristics, including genome size, repeat content, and levels of heterozygosity. K-mer analysis is a powerful biocomputational approach with a wide range of applications, including the estimation of genome sizes. However, interpretation of the results is not always straightforward. Here, I review k-mer-based genome size estimation, focusing specifically on k-mer theory and peak calling in k-mer frequency histograms. I highlight common pitfalls in data analysis and result interpretation, and provide a comprehensive overview of current methods and programs developed to conduct these analyses.
Keywords: BB-tools; CovEST; FindGSE; GCE; GenomeScope; Jellyfish; KSA; Kmergenie; RESPECT.
© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
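To make the idea of peak calling in a k-mer frequency histogram concrete, the following is a minimal sketch of the textbook estimate: genome size is approximately the total number of genomic k-mers divided by the multiplicity of the main (homozygous) peak. It is not the algorithm of any program named in the keywords. The function name `estimate_genome_size`, the dictionary input format, and the toy numbers are all assumptions made for illustration, and the sketch assumes a single clear genomic peak separated from the low-multiplicity error peak by one valley. It does not handle the heterozygous peak of a heterozygous diploid, so it will mis-call the peak when that one is the taller of the two.

```python
def estimate_genome_size(histogram):
    """Estimate genome size from a k-mer frequency histogram.

    histogram: dict mapping k-mer multiplicity -> number of distinct
    k-mers observed at that multiplicity (hypothetical input format).
    Returns (peak_multiplicity, estimated_genome_size).
    """
    depths = sorted(histogram)

    # Locate the first valley: counts fall through the error peak
    # (mostly sequencing errors at low multiplicity) and then start
    # rising again towards the genomic peak.
    valley = None
    for prev, cur in zip(depths, depths[1:]):
        if histogram[cur] > histogram[prev]:
            valley = prev
            break
    if valley is None:
        raise ValueError("no valley found; histogram has no separable peak")

    # Treat everything from the valley onwards as genomic k-mers.
    genomic = [d for d in depths if d >= valley]

    # Peak calling: the multiplicity with the most distinct k-mers.
    peak = max(genomic, key=lambda d: histogram[d])

    # Total k-mer occurrences divided by peak depth approximates the
    # number of k-mer positions in the genome.
    total_kmers = sum(d * histogram[d] for d in genomic)
    return peak, total_kmers / peak


# Toy histogram (made-up numbers, not real data).
toy = {1: 1000, 2: 200, 3: 50, 4: 80, 5: 150, 6: 300, 7: 150, 8: 80, 9: 20}
peak, size = estimate_genome_size(toy)
print(peak, size)
```

In this toy case the valley sits at multiplicity 3 and the peak at 6. Real histograms are noisier, so a naive first-valley rule like this one is best read as an illustration of the principle rather than a replacement for model-fitting tools.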
