K-mer natural vector and its application to the phylogenetic analysis of genetic sequences
- PMID: 24858075
- PMCID: PMC4096558
- DOI: 10.1016/j.gene.2014.05.043
Abstract
Building on the well-known k-mer model, we propose a k-mer natural vector model that represents a genetic sequence by the numbers and distributions of the k-mers it contains. We show that there exists a one-to-one correspondence between a genetic sequence and its associated k-mer natural vector. The k-mer natural vector method can be used easily and quickly to perform phylogenetic analysis of genetic sequences without requiring evolutionary models or human intervention. Whole or partial genomes can be handled more effectively with our proposed method. We apply it to the phylogenetic analysis of genetic sequences, and the results obtained demonstrate that the k-mer natural vector method is a powerful tool, in terms of both accuracy and efficiency, for analysing and annotating genetic sequences and determining evolutionary relationships.
Keywords: K-mer model; Natural vector; Phylogenetic analysis.
Copyright © 2014 Elsevier B.V. All rights reserved.
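The abstract describes the vector only as built from the "numbers and distributions" of k-mers. A minimal Python sketch of one way to read that description follows: for each possible k-mer it records the occurrence count, the mean position, and a normalized second central moment of the positions. The function names (`kmer_natural_vector`, `euclidean_distance`), the 1-based positions, the normalization by count times number of windows, and the use of Euclidean distance are all assumptions made for illustration; the abstract does not state the paper's exact definitions.

```python
from itertools import product
from math import sqrt


def kmer_natural_vector(seq, k, alphabet="ACGT"):
    """Sketch: counts, mean positions, and normalized second central
    moments of every k-mer over `alphabet`, concatenated in that order.

    Assumptions (not taken from the abstract): positions are 1-based
    start indices, and the second moment is divided by
    (count * number_of_windows).
    """
    seq = seq.upper()
    total = len(seq) - k + 1  # number of k-mer windows in the sequence
    if k < 1 or total < 1:
        raise ValueError("k must be between 1 and len(seq)")

    # One position list per possible k-mer, including absent ones.
    positions = {"".join(p): [] for p in product(alphabet, repeat=k)}
    for i in range(total):
        kmer = seq[i:i + k]
        if kmer in positions:  # windows with other symbols (e.g. N) are skipped
            positions[kmer].append(i + 1)

    counts, means, moments = [], [], []
    for kmer in sorted(positions):
        pos = positions[kmer]
        n = len(pos)
        mu = sum(pos) / n if n else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (n * total) if n else 0.0
        counts.append(float(n))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean_distance(u, v):
    """Distance between two vectors of equal length (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Under these assumptions, pairwise distances between the vectors of a set of sequences could be passed to any standard distance-based tree-building method, which is consistent with the abstract's claim that no evolutionary model or alignment step is needed.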
Similar articles
- Phylogenetic analysis of protein sequences based on a novel k-mer natural vector method. Genomics. 2019 Dec;111(6):1298-1305. doi: 10.1016/j.ygeno.2018.08.010. Epub 2018 Sep 5. PMID: 30195069
- A novel method of characterizing genetic sequences: genome space with biological distance and applications. PLoS One. 2011 Mar 2;6(3):e17293. doi: 10.1371/journal.pone.0017293. PMID: 21399690 Free PMC article.
- kmer2vec: A Novel Method for Comparing DNA Sequences by word2vec Embedding. J Comput Biol. 2022 Sep;29(9):1001-1021. doi: 10.1089/cmb.2021.0536. Epub 2022 May 20. PMID: 35593919
- [MtDNA-like sequences and the coordination of the functioning of mammalian nuclear and mitochondrial genomes]. Tsitol Genet. 1997 Mar-Apr;31(2):53-61. PMID: 9157643 Review. Russian.
- Mammalian phylogenomics comes of age. Trends Genet. 2004 Dec;20(12):631-9. doi: 10.1016/j.tig.2004.09.005. PMID: 15522459 Review.
Cited by
- FastGT: an alignment-free method for calling common SNVs directly from raw sequencing reads. Sci Rep. 2017 May 31;7(1):2537. doi: 10.1038/s41598-017-02487-5. PMID: 28566690 Free PMC article.
- DeepRice6mA: A convolutional neural network approach for 6mA site prediction in the rice Genome. PLoS One. 2025 Jun 18;20(6):e0325216. doi: 10.1371/journal.pone.0325216. eCollection 2025. PMID: 40531834 Free PMC article.
- Advancing microbial diagnostics: a universal phylogeny guided computational algorithm to find unique sequences for precise microorganism detection. Brief Bioinform. 2024 Sep 23;25(6):bbae545. doi: 10.1093/bib/bbae545. PMID: 39441245 Free PMC article.
- New Virus Variant Detection Based on the Optimal Natural Metric. Genes (Basel). 2024 Jul 7;15(7):891. doi: 10.3390/genes15070891. PMID: 39062670 Free PMC article.
- Classification of Neisseria meningitidis genomes with a bag-of-words approach and machine learning. iScience. 2024 Feb 16;27(3):109257. doi: 10.1016/j.isci.2024.109257. eCollection 2024 Mar 15. PMID: 38439962 Free PMC article.