Nonlinear Dyn. 2020;101(3):1731-1750.
doi: 10.1007/s11071-020-05771-8. Epub 2020 Jul 4.

Computational analysis of the SARS-CoV-2 and other viruses based on the Kolmogorov's complexity and Shannon's information theories

J A Tenreiro Machado et al. Nonlinear Dyn. 2020.

Abstract

This paper analyzes the information content of 133 RNA viruses available in public databases using several mathematical and computational tools. First, the formal concepts of distance metrics, Kolmogorov complexity and Shannon information are recalled. Second, the computational tools currently available for identifying and visualizing patterns embedded in datasets, such as hierarchical clustering and multidimensional scaling, are discussed. These mathematical and computational resources are then applied jointly to explore the RNA data, cross-evaluating the normalized compression distance, entropy and Jensen-Shannon divergence against representations in two and three dimensions. The results from these different perspectives shed further light on the relations between the distinct RNA viruses.
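The normalized compression distance (NCD) named in the abstract can be illustrated with a short sketch. It uses the commonly cited definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length, and takes zlib as the compressor because the figure captions mention it. The function names, the compression level, and the random test sequences are assumptions made for illustration, not the authors' code or data.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9 is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical stand-ins for genome sequences: two independent random
# strings over the RNA alphabet (not real viral data).
rng = random.Random(0)
seq_a = "".join(rng.choices("ACGU", k=2000)).encode()
seq_b = "".join(rng.choices("ACGU", k=2000)).encode()

d_self = ncd(seq_a, seq_a)    # a sequence compared with itself
d_other = ncd(seq_a, seq_b)   # two unrelated sequences

print(f"NCD(a, a) = {d_self:.3f}")
print(f"NCD(a, b) = {d_other:.3f}")
```

A sequence compared with itself should score much closer to 0 than two unrelated sequences do. A matrix of such pairwise values is the kind of input that hierarchical clustering and multidimensional scaling operate on.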

Keywords: COVID-19; Hierarchical clustering; Kolmogorov complexity theory; Multidimensional scaling; Shannon information theory.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Fig. 1
Coronavirus related to twenty-first-century epidemics. ICST: The International Committee on Taxonomy of Viruses, Coronavirus Study Group; WHO: World Health Organization
Fig. 2
Dendrogram for the set of 133 viruses using the NCD and zlib based on the Kolmogorov complexity theory
Fig. 3
HC tree for the set of 133 viruses using the NCD and zlib based on the Kolmogorov complexity theory
Fig. 4
MDS three-dimensional locus for the set of 133 viruses using the NCD and zlib based on the Kolmogorov complexity theory, with the cluster of SARS-CoV-2 connected by a line
Fig. 5
MDS three-dimensional locus for the set of 133 viruses using the NCD and zlib based on the Kolmogorov complexity theory, without point labels and with the cluster of SARS-CoV-2 connected by a line
Fig. 6
Shannon entropy H vs fractional cumulative residual entropy ε for the set of 133 viruses, with the cluster of SARS-CoV-2 connected by a line
Fig. 7
Dendrogram for the set of 133 viruses using the JSD based on the Shannon information theory
Fig. 8
HC tree for the set of 133 viruses using the JSD based on the Shannon information theory
Fig. 9
MDS three-dimensional locus for the set of 133 viruses using the JSD based on the Shannon information theory, with the cluster of SARS-CoV-2 connected by a line
Fig. 10
MDS three-dimensional locus for the set of 133 viruses using the JSD based on the Shannon information theory, without point labels and with the cluster of SARS-CoV-2 connected by a line
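
The Shannon entropy and Jensen-Shannon divergence (JSD) named in the captions for Figs. 6 to 10 can be sketched as follows. The sketch builds each probability distribution from single-base frequencies over the alphabet A, C, G, U. That choice is an assumption made for illustration, since this record does not say how the authors construct their distributions, and the function names are hypothetical.

```python
import math
from collections import Counter


def base_distribution(seq: str, alphabet: str = "ACGU") -> list[float]:
    """Relative frequency of each alphabet symbol in seq (assumed model)."""
    counts = Counter(seq)
    total = sum(counts[s] for s in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return [counts[s] / total for s in alphabet]


def shannon_entropy(p: list[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def jsd(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence in bits: H(m) - (H(p) + H(q)) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Toy sequences for illustration only (not real viral data).
p = base_distribution("AAAACCGU")
q = base_distribution("ACGUACGU")
print(f"H(p) = {shannon_entropy(p):.3f} bits")
print(f"H(q) = {shannon_entropy(q):.3f} bits")
print(f"JSD(p, q) = {jsd(p, q):.3f} bits")
```

With base-2 logarithms the JSD lies between 0 (identical distributions) and 1 (distributions with disjoint support), which makes it convenient as a bounded dissimilarity measure for clustering.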
