Entropy (Basel). 2024 Mar 1;26(3):223.
doi: 10.3390/e26030223.

Information Theoretic Study of COVID-19 Genome

Philippe Jacquet. Entropy (Basel).

Abstract

In this paper, we analyse the genome sequence of COVID-19 from an information-theoretic point of view and compare it with past and present genomes. We use the powerful tool of joint complexity to quantify the similarities with the various potential parent genomes. The tool has a computational complexity several orders of magnitude below that of the classic Smith-Waterman algorithm, which would allow it to be used on a larger scale.

Keywords: COVID-19; genome; joint complexity; pattern matching.
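
As an illustration of the quantity named in the abstract, the following is a minimal sketch that assumes the usual definition of joint complexity as the number of distinct substrings (factors) that two sequences have in common. The function names `factors` and `joint_complexity`, the optional `max_len` cap, and the exclusion of the empty string are assumptions made for this example and are not taken from the paper. The naive enumeration below is quadratic in sequence length and is only suitable for short inputs; it does not reproduce the efficiency the abstract refers to, which would typically rely on suffix-tree style structures.

```python
def factors(seq, max_len=None):
    """Return the set of distinct non-empty substrings of seq.

    max_len is a hypothetical cap on substring length, added here
    only to keep the naive enumeration tractable on longer inputs.
    """
    n = len(seq)
    limit = n if max_len is None else min(max_len, n)
    return {seq[i:i + k] for k in range(1, limit + 1) for i in range(n - k + 1)}


def joint_complexity(x, y, max_len=None):
    """Count the distinct substrings shared by x and y (assumed definition)."""
    return len(factors(x, max_len) & factors(y, max_len))


if __name__ == "__main__":
    # Shared factors: A, C, G, T, CG, GT, CGT
    print(joint_complexity("ACGT", "CGTA"))  # 7
```

Under this definition, two unrelated sequences over the same alphabet still share many short factors, which is presumably why comparison against random genomes serves as a baseline in Figure 1.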

Conflict of interest statement

The author declares no conflict of interest.

Figures

Figure 1
Joint complexity of SARS-2 genome with bat coronavirus alpha (solid) and with random genomes (dashed).
Figure 2
Normalised joint complexity of SARS-2 genome with bat-α, and with HIV genome (red).
Figure 3
Joint complexity deviations of SARS-2 genome with the 19 HIV genomes.
Figure 4
Joint complexity deviations of the reverse SARS-2 genome with the 19 HIV genomes.
Figure 5
Joint complexity deviations of HIV-2UC1 genome with other matchers.
Figure 6
Putative genealogical tree of SARS-2 COVID-19.
Figure 7
Joint complexity of bat-β genome with bat-α.
Figure 8
Joint complexity of bat-β with itself.
Figure 9
Joint complexity of bat-β genome with SARS-2 COVID-19 genome.
Figure 10
Joint complexity of bat-β with the RaCCS203 genome.
Figure 11
Local offset value of the bat-β genome with SARS-2 COVID-19 genome.
Figure 12
Local offset value of bat-β with the RaCCS203 genome.
Figure 13
Mismatch rates between the bat-β genome slices and the corresponding SARS-2 genome slices (blue) and RaCCS203 genome slices (green).

References

    1. Smith T.F., Waterman M.S. Identification of Common Molecular Subsequences. J. Mol. Biol. 1981;147:195–197. doi: 10.1016/0022-2836(81)90087-5.
    2. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
    3. Jacquet P., Milioris D., Szpankowski W. Classification of Markov sources through joint string complexity: Theory and experiments; Proceedings of the 2013 IEEE International Symposium on Information Theory; Istanbul, Turkey. 7–12 July 2013; pp. 2289–2293.
    4. Milioris D. Topic Detection and Classification in Social Networks. Springer; Berlin/Heidelberg, Germany: 2018. Joint Sequence Complexity: Introduction and Theory; pp. 21–56.
    5. Burnside G., Milioris D., Jacquet P. One Day in Twitter: Topic Detection Via Joint Complexity; Proceedings of the SNOW 2014 Data Challenge; Seoul, Republic of Korea. 8 April 2014.
