Appl Intell (Dordr). 2021;51(5):3086-3103.
doi: 10.1007/s10489-021-02193-w. Epub 2021 Feb 17.

Using artificial intelligence techniques for COVID-19 genome analysis

M Saqib Nawaz et al. Appl Intell (Dordr). 2021.

Abstract

The genome of the novel coronavirus that causes COVID-19 was first sequenced in January 2020, approximately a month after the disease emerged in Wuhan, the capital of Hubei province, China. COVID-19 genome sequencing is critical for understanding the virus's behavior, its origin, and how fast it mutates, and for developing drugs, vaccines, and effective preventive strategies. This paper investigates the use of artificial intelligence techniques to extract interesting information from COVID-19 genome sequences. First, sequential pattern mining (SPM) is applied to a machine-readable corpus of COVID-19 genome sequences to look for hidden patterns that reveal frequent combinations of nucleotide bases and their relationships with each other. Second, sequence prediction models are applied to the corpus to evaluate whether nucleotide bases can be predicted from the preceding ones. Third, for mutation analysis, an algorithm is designed to find the locations in the genome sequences where nucleotide bases have changed and to calculate the mutation rate. The results suggest that SPM and mutation analysis can reveal interesting information and patterns in COVID-19 genome sequences, which can be used to examine the evolution of and variations in COVID-19 strains, respectively.

Keywords: COVID-19; Genome sequence; Mutation; Nucleotide bases; Sequential pattern mining.
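
The mutation analysis step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes two sequences that are already aligned to the same length, and it assumes the mutation rate is the fraction of positions whose bases differ. The function names `find_mutations` and `mutation_rate` and the short example sequences are hypothetical.

```python
from typing import List, Tuple


def find_mutations(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumption: both sequences are pre-aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def mutation_rate(reference: str, sample: str) -> float:
    """Fraction of positions that differ (an assumed definition of the rate)."""
    if not reference:
        raise ValueError("sequences must not be empty")
    return len(find_mutations(reference, sample)) / len(reference)


# Hypothetical toy sequences, not real SARS-CoV-2 data.
reference = "ATTAAAGGTT"
sample = "ATTCAAGGTA"

mutations = find_mutations(reference, sample)
rate = mutation_rate(reference, sample)
print(mutations)
print(rate)
```

Real genome comparisons would first need a sequence alignment step to handle insertions and deletions, which this sketch deliberately leaves out.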

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Fig. 1. SARS-CoV-2 structure [21]
Fig. 2. Structure of the SARS-CoV-2 genome [24]
Fig. 3. Proposed SPM and sequence prediction approach for analyzing COVID-19 genome sequences
Fig. 4. Sequential rules discovered in a genome sequence by ERMiner
Fig. 5. COVID-19 genome mutation in whole sequences
Fig. 6. COVID-19 genome mutation

References

    1. Wu F, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–269. doi: 10.1038/s41586-020-2008-3.
    2. Sohrabi C, et al. World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19). Int J Surg. 2020;76:71–76. doi: 10.1016/j.ijsu.2020.02.034.
    3. Cucinotta D, Vanelli M. WHO declares COVID-19 a pandemic. Acta Biomed. 2020;91:157–160.
    4. WHO. WHO coronavirus disease (COVID-19) dashboard. Accessed December 6, 2020.
    5. Mousavizadeh L, Ghasemi S. Genotype and phenotype of COVID-19: Their roles in pathogenesis. J Microbiol Immunol Infect. 2020. doi: 10.1016/j.jmii.2020.03.022.
