Front Microbiol. 2021 May 20;12:673855.
doi: 10.3389/fmicb.2021.673855. eCollection 2021.

Genomic Epidemiology of SARS-CoV-2 From Mainland China With Newly Obtained Genomes From Henan Province

Ning Song et al. Front Microbiol. 2021.

Abstract

Even though the COVID-19 epidemic in China has been brought under control within a few months, it remains important to infer the origin time and genetic diversity of its causative agent, SARS-CoV-2, from whole-genome sequences. Yet the whole-genome sequences from China in current public databases are very unevenly distributed with respect to time and place of collection. In particular, only one sequence had been obtained from Henan Province, which is adjacent to Hubei Province, the worst-affected province in China. Herein, we used high-throughput sequencing to obtain 19 whole-genome sequences of SARS-CoV-2 from 18 severely ill patients admitted to the First Affiliated Hospital of Zhengzhou University, a provincially designated hospital for the treatment of severe COVID-19 cases in Henan Province. The demographic, baseline, and clinical characteristics of these patients are described. To investigate the molecular epidemiology of SARS-CoV-2 in the COVID-19 outbreak in China, 729 genome sequences (including the 19 sequences from this study) sampled from Mainland China were analyzed with a comprehensive set of methods, including likelihood mapping, split-network, maximum-likelihood (ML) phylogenetic, and Bayesian time-scaled phylogenetic analyses. We estimated that the evolutionary rate and the time to the most recent common ancestor (TMRCA) of SARS-CoV-2 from Mainland China were 9.25 × 10⁻⁴ substitutions per site per year (95% Bayesian credible interval [BCI]: 6.75 × 10⁻⁴ to 1.28 × 10⁻³) and October 1, 2019 (95% BCI: August 22, 2019 to November 6, 2019), respectively. Our results contribute to the study of the molecular epidemiology and genetic diversity of SARS-CoV-2 over time in Mainland China.
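
To give a rough sense of scale for the estimated rate, the sketch below converts the per-site rate into an expected number of substitutions per genome per year. The genome length of 29,903 nt (the Wuhan-Hu-1 reference, NC_045512.2) is an assumption introduced here for illustration; the abstract does not state it, and the function name is hypothetical.

```python
# Assumption: length of the Wuhan-Hu-1 reference genome (not given in the abstract).
GENOME_LENGTH_NT = 29_903

# Rate estimates quoted in the abstract (substitutions per site per year).
RATE_ESTIMATES = {
    "posterior estimate": 9.25e-4,
    "95% BCI lower": 6.75e-4,
    "95% BCI upper": 1.28e-3,
}


def substitutions_per_genome_per_year(rate_per_site, genome_length=GENOME_LENGTH_NT):
    """Scale a per-site yearly rate to a whole-genome yearly rate."""
    return rate_per_site * genome_length


for label, rate in RATE_ESTIMATES.items():
    per_genome = substitutions_per_genome_per_year(rate)
    print(f"{label}: {per_genome:.1f} substitutions per genome per year")
```

Under that assumed length, the point estimate corresponds to roughly 28 substitutions per genome per year, with the interval spanning about 20 to 38.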

Keywords: Henan Province; Mainland China; SARS-CoV-2; evolutionary rate; tMRCA; whole-genome sequence.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Clade assignment of the 19 Henan sequences by Nextclade. Currently, five major clades are defined: 19A and 19B emerged in Wuhan and dominated the early outbreak; 20A emerged from 19A, dominated the European outbreak in March, and has since spread globally; 20B and 20C are large, genetically distinct subclades of 20A. The 19 Henan sequences are highlighted and marked with solid circles at the ends of their branches.
Figure 2
Time series and geographic distribution of the 729 SARS-CoV-2 genomes from Mainland China by sampling date. The geographic distribution of the 729 SARS-CoV-2 genomes from Mainland China in the present study is shown at the provincial level. Colors indicate different sampling provinces from Mainland China.
Figure 3
Estimated maximum-likelihood phylogenetic tree of SARS-CoV-2 from Mainland China. The maximum-likelihood phylogenetic tree for “dataset_729” from Mainland China is shown. The tree is midpoint rooted. Colors indicate different sampling provinces from Mainland China. The scale bar at the bottom indicates 0.00005 nucleotide substitutions per site.
Figure 4
Root-to-tip genetic divergence plot of SARS-CoV-2 from Mainland China. Root-to-tip genetic divergence in the maximum-likelihood tree for “dataset_729” from Mainland China (as shown in Figure 3) is plotted against sampling date. Colors indicate different sampling provinces from Mainland China. The gray line is the linear regression line.
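
A root-to-tip plot of this kind is usually read through a linear regression of divergence on sampling date: the slope gives a crude estimate of the evolutionary rate, and the x-intercept gives a crude estimate of the date of the root. The sketch below is a minimal ordinary least-squares version of that idea, not necessarily the procedure the authors used. The function name and the sample values are hypothetical and are not taken from the study.

```python
def root_to_tip_regression(decimal_dates, divergences):
    """Least-squares fit of divergence (subs/site) against sampling date.

    Returns (slope, x_intercept): the slope approximates the evolutionary
    rate and the x-intercept approximates the date of the root.
    """
    n = len(decimal_dates)
    mean_x = sum(decimal_dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in decimal_dates)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(decimal_dates, divergences))
    slope = sxy / sxx
    # Date at which the fitted line crosses zero divergence.
    x_intercept = mean_x - mean_y / slope
    return slope, x_intercept


# Hypothetical, noise-free data generated from a known rate and root date,
# so the fit should recover both values.
assumed_rate = 9.0e-4      # illustrative value only
assumed_root = 2019.75     # illustrative value only
sample_dates = [2020.00, 2020.05, 2020.10, 2020.15, 2020.20]
sample_divergences = [assumed_rate * (t - assumed_root) for t in sample_dates]

rate, root_date = root_to_tip_regression(sample_dates, sample_divergences)
print(f"estimated rate: {rate:.2e} substitutions per site per year")
print(f"estimated root date: {root_date:.3f}")
```

With real data the points scatter around the line, and this regression treats tips as independent even though they share ancestry, which is one reason a Bayesian time-scaled analysis is used for the reported estimates.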
Figure 5
Estimated Bayesian time-scaled maximum-clade-credibility phylogenetic tree of SARS-CoV-2 from Mainland China. The circle at each tip is colored according to the sampling province in Mainland China. Note that the horizontal axis is scaled in decimal dates.
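
For readers unfamiliar with decimal dates, the sketch below shows one common convention: the year plus the fraction of that year elapsed at the start of the given day. The helper name is hypothetical, and phylogenetic tools differ slightly in the convention they use (for example, start of day versus midday), so values read off the figure may differ in the third decimal place.

```python
from datetime import date


def to_decimal_date(d):
    """Convert a calendar date to a decimal year (start-of-day convention)."""
    year_start = date(d.year, 1, 1)
    next_year_start = date(d.year + 1, 1, 1)
    days_in_year = (next_year_start - year_start).days
    return d.year + (d - year_start).days / days_in_year


# TMRCA estimate and 95% BCI bounds quoted in the abstract.
for label, d in [
    ("TMRCA estimate", date(2019, 10, 1)),
    ("95% BCI lower", date(2019, 8, 22)),
    ("95% BCI upper", date(2019, 11, 6)),
]:
    print(f"{label}: {d.isoformat()} = {to_decimal_date(d):.3f}")
```

Under this convention, October 1, 2019 corresponds to approximately 2019.748.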
