Int J Infect Dis. 2020 Nov:100:164-173.
doi: 10.1016/j.ijid.2020.08.066. Epub 2020 Aug 28.

Comprehensive evolution and molecular characteristics of a large number of SARS-CoV-2 genomes reveal its epidemic trends

Yunmeng Bai et al. Int J Infect Dis. 2020 Nov.

Abstract

Objectives: To further reveal the phylogenetic evolution and molecular characteristics of the whole genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on a large number of genomes and provide a basis for the prevention and treatment of SARS-CoV-2.

Methods: Various evolutionary analysis methods were employed.

Results: The estimated ratio of the rates of non-synonymous to synonymous changes (Ka/Ks) of SARS-CoV-2 was 1.008 or 1.094 based on 622 or 3624 SARS-CoV-2 genomes, respectively. Nine key specific sites of high linkage and four major haplotypes (H1, H2, H3 and H4) were found. The Ka/Ks values, detected population sizes and development trends of each major haplotype showed that the H3 and H4 subgroups were undergoing purifying evolution and almost disappeared after detection, indicating that they might have existed for a long time. The H1 and H2 subgroups were undergoing near-neutral or neutral evolution and increased globally over time. The frequency of H1 was generally high in Europe and correlated with the death rate (r >0.37), suggesting that these two haplotypes might relate to the infectivity or pathogenicity of SARS-CoV-2.
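
As an illustration of how the Ka/Ks values above are usually read, the following minimal Python sketch labels a ratio as purifying, near neutral or positive. The function name `classify_ka_ks` and the tolerance of 0.1 around 1 are assumptions made for this example; the abstract does not state what threshold, if any, the authors used.

```python
def classify_ka_ks(ratio: float, tolerance: float = 0.1) -> str:
    """Label a Ka/Ks ratio by the conventional reading of selection pressure.

    Ka/Ks < 1 suggests purifying selection, Ka/Ks close to 1 suggests
    neutral evolution, and Ka/Ks > 1 suggests positive selection.
    The tolerance is an assumed cut-off for this sketch, not a value
    taken from the paper.
    """
    if ratio < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if abs(ratio - 1.0) <= tolerance:
        return "near neutral"
    return "purifying" if ratio < 1.0 else "positive"


# The two genome-wide estimates reported in the Results.
for estimate in (1.008, 1.094):
    print(estimate, classify_ka_ks(estimate))
```

Under this assumed tolerance, both reported estimates fall in the near-neutral band, which matches the wording of the Results.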

Conclusions: Based on the evolution and epidemiology of 16,373 SARS-CoV-2 genomes, several key specific sites and haplotypes related to the infectivity or pathogenicity of SARS-CoV-2 were identified, and a possible earlier origin time and place of SARS-CoV-2 were indicated.

Keywords: Classification; Evolution; Haplotype; SARS-CoV-2.

Conflict of interest statement

Declaration of Competing Interest: The authors report no declarations of interest.

Figures

Fig. 1
Phylogenetic tree and clusters of 622 SARS-CoV-2 genomes. The 622 sequences were clustered into three clusters: Cluster 1 was mainly from North America, Cluster 2 was from regions all over the world, and Cluster 3 was mainly from Europe.
Fig. 2
Linkage disequilibrium plot of haplotypes of the nine specific sites. A. The plot for 622 genome sequences; B. The plot for 3624 genome sequences.
Fig. 3
The frequencies of both the nine specific sites and haplotypes. The frequencies of the nine specific sites (A) and haplotypes (B) in each country for 3624 genomes.
Fig. 4
The characteristics of haplotype subgroups. A. The numbers of haplotypes of the nine specific sites for 16,373 genomes with clear collection data detected in each country in chronological order; B. The whole genome mutations in each major haplotype subgroup.
Fig. 5
Phylogenetic network of haplotype subgroups for 3624 genomes. The network was inferred by POPART using the TCS method. Each colored vertex represents a haplotype, with different colors indicating the different sampling areas. Hatch marks along the edges indicate the number of mutations. Small black circles within the network indicate unsampled haplotypes. The H1-H5 subgroups are labelled according to the haplotypes of the nine specific sites; other small subgroups are not labelled.
Fig. 6
The correlation between death rates and frequencies of both the nine specific sites and haplotypes.
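
The correlation reported in the abstract (r >0.37 between H1 frequency and death rate) is a correlation coefficient across countries. The sketch below shows how a Pearson coefficient of this kind can be computed in plain Python. It assumes Pearson's r, which the abstract does not confirm, and the function name `pearson_r` and all numeric values are hypothetical placeholders rather than data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, one entry per country (NOT data from the paper).
h1_frequency = [0.10, 0.35, 0.60, 0.80]
death_rate = [0.02, 0.03, 0.07, 0.06]

print(f"r = {pearson_r(h1_frequency, death_rate):.3f}")
```

A coefficient of this kind describes association across countries only; on its own it does not establish that a haplotype causes higher mortality.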
