Sci Rep. 2020 Dec 10;10(1):21617.
doi: 10.1038/s41598-020-78703-6.

Genomic recombination events may reveal the evolution of coronavirus and the origin of SARS-CoV-2

Zhenglin Zhu et al. Sci Rep. 2020.

Erratum in

Abstract

To trace the evolution of coronaviruses and reveal the possible origin of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the coronavirus disease 2019 (COVID-19), we collected and thoroughly analyzed 29,452 publicly available coronavirus genomes, including 26,312 genomes of SARS-CoV-2 strains. We observed coronavirus recombination events among different hosts including 3 independent recombination events with statistical significance between some isolates from humans, bats and pangolins. Consistent with previous records, we also detected putative recombination between strains similar or related to Bat-CoV-RaTG13 and Pangolin-CoV-2019. The putative recombination region is located inside the receptor-binding domain (RBD) of the spike glycoprotein (S protein), which may represent the origin of SARS-CoV-2. Population genetic analyses provide estimates suggesting that the putative introduced DNA within the RBD is undergoing directional evolution. This may result in the adaptation of the virus to hosts. Unsurprisingly, we found that the putative recombination region in S protein was highly diverse among strains from bats. Bats harbor numerous coronavirus subclades that frequently participate in recombination events with human coronavirus. Therefore, bats may provide a pool of genetic diversity for the origin of SARS-CoV-2.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Verification of the three recombination events from phylogenetic trees. (A) Whole genome phylogenetic tree. (B) Phylogenetic tree built from sequences in RI_RNA_ORF1. (C) Phylogenetic tree built from sequences in RI_RNA_Boundary. (D) Phylogenetic tree built from sequences in RI_RNA_S. The trees were built using strains related to recombination and related outgroups. The names of the coronaviruses to which the strains belong are listed to the right of each phylogenetic tree. The numbers marked in red are the marginal likelihoods of the tree. The trees were built with MEGA using the Jukes-Cantor model. Phylogeny tests were performed using the bootstrap method with 5000 replicates.
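
For readers unfamiliar with the Jukes-Cantor model named in this caption, the following is a minimal sketch of the standard Jukes-Cantor (JC69) pairwise distance, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. It is not the authors' pipeline (the trees were built in MEGA); the function names and the choice to skip gaps and ambiguous bases are assumptions made for illustration.

```python
import math

VALID_BASES = set("ACGT")  # assumption: sequences are given in DNA notation


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an illustrative choice, not taken from the paper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """JC69 distance: expected substitutions per site under equal rates."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once sequences look saturated.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least as large as the raw proportion p, because the model accounts for multiple substitutions at the same site.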
Figure 2
A sketch of the three recombination events and population genetic analysis results for RI_RNA_S. (A) Coordinate positions of the three recombinationally integrated RNA regions (indicated by orange dotted lines) in the genome of SARS-CoV-2 (MN908947), with major proteins marked. ‘a’, ‘b’ and ‘c’ refer to RI_RNA_ORF1, RI_RNA_Boundary and RI_RNA_S, respectively. Yellow represents the RBD in S protein. Red arrows with lines indicate the direction of transcription in SARS-CoV-2. (B) Diagram depicting a possible origin of SARS-CoV-2. (C) Snapshot of sliding window analysis of Fst (between coronaviruses from human and bat, human and pangolin, human and camel, human and cow, and bat and pangolin). The region of RI_RNA_S is marked by a red rectangle. In the legend to the right, peaks at RI_RNA_S that are statistically significant (with values higher than the 0.05 threshold in the nearby region) are marked with ‘**’, and those with weak significance (with values higher than the 0.1 threshold in the nearby region) are marked with ‘*’. (D) Comparison of the distributions of Fst in RI_RNA_S (red) and the nearby region (background, blue). Pairs of distributions in RI_RNA_S and the flanking region were compared by the Wilcoxon rank sum test, and a P-value is given. Vertical dashed lines denote the 0.05 cutoff (red) and 0.1 cutoff (orange) of the background distribution. (E) Sliding window analysis of CLRs with RI_RNA_S marked by a red rectangle. The result was generated using SARS-CoV-2 strains collected in April. Red triangles denote the two CLR peaks surrounding RI_RNA_S. The two peaks are significant or weakly significant when the nearby region (from 21,000 to 25,000 bp) is used as the background, whose top 0.05 cutoff is denoted by a red dashed line and top 0.1 cutoff by an orange dashed line.
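
The sliding window Fst analysis described in this caption can be illustrated with a small sketch. The source does not say which Fst estimator was used, so the code below assumes a Hudson-style estimator, Fst = 1 - Hw/Hb, with Hw the mean within-group pairwise difference and Hb the mean between-group pairwise difference. All function names, the window and step parameters, and the treatment of gaps as ordinary characters are hypothetical simplifications.

```python
from itertools import combinations, product


def mean_pairwise_diff(pairs, start, end):
    """Mean number of differing sites in [start, end) over sequence pairs."""
    total = 0
    count = 0
    for a, b in pairs:
        total += sum(x != y for x, y in zip(a[start:end], b[start:end]))
        count += 1
    return total / count if count else 0.0


def hudson_fst(pop1, pop2, start, end):
    """Hudson-style Fst for one window; None when the window has no
    between-group differences (the ratio is undefined there)."""
    within1 = mean_pairwise_diff(combinations(pop1, 2), start, end)
    within2 = mean_pairwise_diff(combinations(pop2, 2), start, end)
    between = mean_pairwise_diff(product(pop1, pop2), start, end)
    if between == 0:
        return None
    return 1.0 - ((within1 + within2) / 2.0) / between


def sliding_fst(pop1, pop2, window, step):
    """List of (window_start, Fst) along an alignment, 0-based starts."""
    length = len(pop1[0])
    return [
        (start, hudson_fst(pop1, pop2, start, start + window))
        for start in range(0, length - window + 1, step)
    ]
```

Given two lists of aligned sequences from different hosts, `sliding_fst(pop1, pop2, window, step)` yields one value per window; windows with high values are those where the two groups are most differentiated.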
Figure 3
Evidence showing that bats may be a pool of genetic diversity. (A) Comparison of Pi in the nearby region of RI_RNA_S for coronaviruses from 7 different hosts, including bat, human and pangolin. Pi values were calculated through a sliding window approach in the region from 21,500 to 25,000 bp according to MN908947. (B) Numbers of subclades of coronavirus in different hosts. (C) Pie chart showing the numbers of independent recombination events in different hosts. Bats harbored the highest number and are marked in red. (D) Heatmap showing the numbers of independent recombination events occurring in coronaviruses between pairs of hosts (the x and y axes). We did not consider recombination events between coronaviruses from the same host, which are marked by black squares.
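
As a rough illustration of the sliding window Pi calculation mentioned in panel (A), the sketch below computes nucleotide diversity as the mean number of pairwise differences per site within each window. The window size, step, function names and 0-based coordinates are assumptions, gaps are treated as ordinary characters for simplicity, and the authors' actual software and settings are not stated in this text.

```python
from itertools import combinations


def nucleotide_diversity(seqs, start, end):
    """Pi for [start, end): mean pairwise differences per site."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    if end <= start:
        raise ValueError("window must have positive length")
    diffs = sum(
        sum(x != y for x, y in zip(a[start:end], b[start:end]))
        for a, b in pairs
    )
    return diffs / (len(pairs) * (end - start))


def sliding_pi(seqs, window, step, region_start=0, region_end=None):
    """List of (window_start, Pi) over a region of the alignment."""
    if region_end is None:
        region_end = len(seqs[0])
    return [
        (start, nucleotide_diversity(seqs, start, start + window))
        for start in range(region_start, region_end - window + 1, step)
    ]
```

Running `sliding_pi` separately on the aligned sequences from each host and comparing the resulting profiles would give a comparison in the spirit of panel (A).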
