J Med Virol. 2020 Jun;92(6):602-611.
doi: 10.1002/jmv.25731. Epub 2020 Mar 11.

Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2

Xingguang Li et al. J Med Virol. 2020 Jun.

Abstract

To investigate the evolutionary history of the recent outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China, a total of 70 genomes of virus strains from China and elsewhere with sampling dates between 24 December 2019 and 3 February 2020 were analyzed. To explore the potential intermediate animal host of the SARS-CoV-2 virus, we reanalyzed virome data sets from pangolins and representative SARS-related coronavirus isolates from bats, with particular attention paid to the spike glycoprotein gene. We performed phylogenetic, split network, transmission network, likelihood-mapping, and comparative analyses of the genomes. Based on Bayesian time-scaled phylogenetic analysis using the tip-dating method, we estimated the time to the most recent common ancestor and evolutionary rate of SARS-CoV-2, which ranged from 22 to 24 November 2019 and 1.19 to 1.31 × 10⁻³ substitutions per site per year, respectively. Our results also revealed that the BetaCoV/bat/Yunnan/RaTG13/2013 virus was more similar to the SARS-CoV-2 virus than the coronavirus obtained from the two pangolin samples (SRR10168377 and SRR10168378). We also identified a unique peptide (PRRA) insertion in the human SARS-CoV-2 virus, which may be involved in the proteolytic cleavage of the spike protein by cellular proteases, and thus could impact host range and transmissibility. Interestingly, the coronavirus carried by pangolins did not have the RRAR motif. Therefore, we concluded that the human SARS-CoV-2 virus, which is responsible for the recent outbreak of COVID-19, did not come directly from pangolins.

Keywords: COVID-19; SARS-CoV-2; TMRCA; cross-species transmission; evolutionary rate; potential intermediate animal host.
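
The rate and TMRCA reported in the abstract come from a Bayesian tip-dating analysis. A simpler, related check of temporal signal is the root-to-tip regression described in the Figure 3 caption: genetic distance from the root is regressed on sampling date, the slope approximates the evolutionary rate, and the x-intercept approximates the TMRCA. The sketch below illustrates that idea only. It is not the authors' pipeline, the function name is hypothetical, and the dates and distances are made-up toy values, not data from the study.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    dates     -- sampling dates as decimal years (e.g. 2020.05)
    distances -- root-to-tip distances in substitutions per site
    Returns (rate, tmrca): the slope in substitutions/site/year and the
    x-intercept, i.e. the date at which the fitted distance is zero.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("no positive temporal signal in the data")
    intercept = mean_d - rate * mean_t
    tmrca = -intercept / rate
    return rate, tmrca


# Toy input (assumption): points placed exactly on a line with a rate of
# 1.2e-3 substitutions/site/year and a root dated 2019.90.
toy_dates = [2019.98, 2020.00, 2020.03, 2020.06, 2020.09]
toy_distances = [1.2e-3 * (t - 2019.90) for t in toy_dates]

rate, tmrca = root_to_tip_regression(toy_dates, toy_distances)
print(f"rate = {rate:.2e} substitutions/site/year, TMRCA = {tmrca:.2f}")
```

Real data scatter around the line, so the regression gives only a rough point estimate without credible intervals, which is one reason a Bayesian tip-dating analysis is used for the reported values.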

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Figure 1
Likelihood‐mapping and split network analyses of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). Likelihoods of three tree topologies for each possible quartet (or for a random sample of quartets) are denoted by data points in an equilateral triangle. Distribution of points in seven areas of the triangle reflects tree‐likeness of data. Specifically, three corners represent fully resolved tree topologies; center represents an unresolved (star) phylogeny; and sides represent support for conflicting tree topologies. Results of likelihood‐mapping analyses of two data sets (“dataset_70,” A; and “dataset_6,” B) and split network analyses of “dataset_70” (C) are shown
Figure 2
Estimated maximum‐likelihood phylogenies of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). Colors indicate different sampling locations. Tree is midpoint rooted. Results of maximum‐likelihood phylogenetic analyses of “dataset_70” are shown
Figure 3
Regression of root‐to‐tip genetic distance against year of sampling for severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). Colors indicate different sampling locations. Gray indicates linear regression line. Results of linear regression analyses of “dataset_70” are shown
Figure 4
Estimated maximum‐clade‐credibility (MCC) tree of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) using tip‐dating method. Colors indicate different sampling locations. Nodes are labeled with posterior probability values. The MCC tree of “dataset_70” is shown
Figure 5
Transmission clusters of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). Structure of inferred SARS‐CoV‐2 transmission clusters from “dataset_70” using genetic distances of <0.001% substitutions/site is shown. Nodes (circles) represent connected individuals in overall network, and putative transmission linkages are represented by edges (lines). Nodes are color‐coded by sampling locations
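
The Figure 5 caption describes linking genomes whose pairwise genetic distance falls below a threshold and reading clusters off the resulting network. A minimal sketch of that general idea follows. It assumes pre-aligned sequences and a simple p-distance (proportion of differing sites), which may differ from the distance model and software the authors used; the function names, the toy sequences, and the threshold value are all hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def transmission_clusters(seqs, threshold):
    """Group sequences into connected components of the distance network.

    seqs      -- dict mapping sequence name to aligned sequence
    threshold -- two sequences are linked if their distance is below this
    Returns a list of clusters (sorted name lists), largest first.
    Unlinked sequences are returned as singleton clusters.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Toy alignment (assumption): 10-site sequences, not real genomes.
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAC",
    "s3": "ACGTACGTAT",
    "s4": "TTGTACGTAT",
}
clusters = transmission_clusters(toy, threshold=0.15)
print(clusters)
```

Because linkage is transitive, two sequences can share a cluster even when their own distance exceeds the threshold, as long as a chain of closer sequences connects them.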
