Comparing full variation profile analysis with the conventional consensus method in SARS-CoV-2 phylogeny
- PMID: 38920083
- PMCID: PMC11199993
- DOI: 10.1093/bib/bbae296
Abstract
This study proposes a novel approach to studying mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by comparing sequencing data directly. Traditional consensus-based methods, which keep only the most common nucleotide at each position, can overlook or obscure low-frequency variants. Our method, in contrast, retains all sequenced nucleotides at each position, forming a genomic matrix. Using short reads simulated from genomes with specified mutations, we compared the genomic matrix approach with the consensus sequence method. Across multiple simulated datasets, the matrix method accurately reflected the known mutations, with an average accuracy improvement of 20% over the consensus method. In real-world tests using data from GISAID and NCBI-SRA, our approach was more reliable, reducing the error margin by approximately 15%. The genomic matrix approach represents viral genomic diversity more accurately and thereby provides better insight into virus evolution and epidemiology.
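The contrast between the two representations can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the matrix is built or compared, so the read format (a 0-based start position plus an ungapped sequence), the function names, and the distance measure (half the L1 distance between base frequencies, summed over positions) are all assumptions chosen for illustration.

```python
BASES = "ACGT"


def count_matrix(reads, genome_length):
    """Per-position base counts: one [A, C, G, T] row per genome position.

    `reads` is a list of (start, sequence) pairs; this ungapped format is an
    assumption for the sketch, not the format used in the paper.
    """
    matrix = [[0, 0, 0, 0] for _ in range(genome_length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if base in BASES and 0 <= pos < genome_length:
                matrix[pos][BASES.index(base)] += 1
    return matrix


def frequency_profile(matrix):
    """Convert counts to per-position base frequencies (all zeros if uncovered)."""
    profile = []
    for row in matrix:
        depth = sum(row)
        profile.append([c / depth if depth else 0.0 for c in row])
    return profile


def consensus(matrix):
    """Most common base at each position; 'N' where there is no coverage."""
    return "".join(
        BASES[row.index(max(row))] if sum(row) else "N" for row in matrix
    )


def profile_distance(p, q):
    """Hypothetical distance: half the L1 difference in frequencies, summed."""
    return sum(
        0.5 * sum(abs(a - b) for a, b in zip(row_p, row_q))
        for row_p, row_q in zip(p, q)
    )


def consensus_distance(s, t):
    """Number of positions at which two consensus sequences differ."""
    return sum(1 for a, b in zip(s, t) if a != b)


# Sample A: four identical reads. Sample B: one of four reads carries G->T
# at position 2, i.e. a variant present at 25% frequency.
sample_a = [(0, "ACGT")] * 4
sample_b = [(0, "ACGT")] * 3 + [(0, "ACTT")]

matrix_a = count_matrix(sample_a, 4)
matrix_b = count_matrix(sample_b, 4)

cons_dist = consensus_distance(consensus(matrix_a), consensus(matrix_b))
prof_dist = profile_distance(
    frequency_profile(matrix_a), frequency_profile(matrix_b)
)
print(cons_dist, prof_dist)
```

In this toy case both samples collapse to the same consensus sequence, so the consensus comparison reports no difference, while the frequency profile still registers the minority variant at position 2.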
Keywords: NGS sequencing; SARS-CoV-2; full variation profile analysis; viral variants.
© The Author(s) 2024. Published by Oxford University Press.
