[Preprint]. 2021 Apr 6:2020.02.10.942748.
doi: 10.1101/2020.02.10.942748.

Recombination and lineage-specific mutations linked to the emergence of SARS-CoV-2

Juan Ángel Patiño-Galindo et al. bioRxiv.

Abstract

The emergence of SARS-CoV-2 underscores the need to better understand the evolutionary processes that drive the emergence and adaptation of zoonotic viruses in humans. In the betacoronavirus genus, which also includes SARS-CoV and MERS-CoV, recombination frequently encompasses the Receptor Binding Domain (RBD) of the Spike protein, which, in turn, is responsible for viral binding to host cell receptors. Here, we find evidence of a recombination event in the RBD involving lineages ancestral to both SARS-CoV and SARS-CoV-2. Although we cannot identify either the recombinant or the parental strains, likely due to the ancestry of the event and potential undersampling, our statistical analyses in the space of phylogenetic trees support such an ancestral recombination. Consequently, SARS-CoV and SARS-CoV-2 share an RBD sequence that includes two insertions (positions 432-436 and 460-472), as well as the variants 427N and 436Y. Both 427N and 436Y belong to a helix that interacts directly with the human ACE2 (hACE2) receptor. Reconstruction of ancestral states, combined with protein-binding affinity analyses using the physics-based trRosetta algorithm, reveals that the recombination event involving ancestral strains of SARS-CoV and SARS-CoV-2 led to an increased affinity for hACE2 binding, and that alleles 427N and 436Y significantly enhanced affinity as well. Structural modeling indicates that ancestors of SARS-CoV-2 may have acquired the ability to infect humans decades ago. The binding affinity with the human receptor was subsequently boosted in SARS-CoV and SARS-CoV-2 through further mutations in the RBD. In sum, we report an ancestral recombination event affecting the RBD of both SARS-CoV and SARS-CoV-2 that was associated with an increased binding affinity to hACE2.

Conflict of interest statement

Disclosure of Potential Conflicts of Interest: R.R. is a member of the SAB of AimedBio in a project unrelated to the current manuscript. PKS is a member of the SAB or Board of Directors of Applied Biomath LLC, Glencoe Software Inc, and RareCyte Inc and has equity in these companies; he is also on the SAB of NanoString Inc. In the last five years the Sorger lab has received research funding from Novartis and Merck. Sorger declares that none of these relationships are directly or indirectly related to the content of this manuscript. The other authors declare no conflicts.

Figures

Fig. 1 | Recombination analysis of betacoronaviruses.
a. Distribution of 103 inferred recombination events among human and non-human beta-CoV isolates showing the span of each recombinant region along the viral genome with respect to SARS-CoV coordinates. The spike protein and its RBD are highlighted. b. Sliding window analysis shows (blue curve) the distribution of recombination breakpoints (either start or end) in 800 nucleotide (nt) length windows upstream (namely, in the 5’ to 3’ direction) of every nt position along the viral genome. The spike protein, and in particular the RBD and its immediate downstream region, are significantly enriched in recombination breakpoints in betacoronaviruses. Benjamini-Yekutieli (BY) corrected p-values are shown (red curve), and the 5% BY FDR is shown for reference (dotted line).
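The sliding-window enrichment test described in panel b can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the choice of an upper-tail binomial test per window, and the convention that a window covers `[start, start + width)` are all assumptions made here for clarity. Only the 800 nt window width and the Benjamini-Yekutieli correction come from the caption.

```python
from math import comb

def window_counts(breakpoints, genome_len, width=800):
    """Count breakpoints in [start, start + width) for every window start.

    Assumption: each window begins at a position and extends `width` nt
    in the 5' to 3' direction.
    """
    return [
        sum(start <= b < start + width for b in breakpoints)
        for start in range(genome_len - width + 1)
    ]

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more breakpoints."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def benjamini_yekutieli(pvals):
    """Benjamini-Yekutieli adjusted p-values (valid under arbitrary dependence)."""
    m = len(pvals)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under these assumptions, a window would be called enriched when its adjusted p-value falls below 0.05, with the per-breakpoint null probability taken as `width / genome_len`. Overlapping windows are strongly dependent, which is one reason a correction that tolerates dependence, such as BY, is a natural choice.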
Fig. 2 | Recombination analysis in MERS coronaviruses.
a. Distribution of 24 recombination events among human and non-human MERS-CoV isolates. The spike protein and its RBD are highlighted. b. Sliding window analysis shows (blue curve) the distribution of recombination breakpoints (either start or end) in 800 nucleotide (nt) length windows upstream (namely, in the 5’ to 3’ direction) of every nt position along the viral genome. The spike protein, and the RBD in particular, overlap with windows that are enriched in recombination breakpoints. Binomial test p-values (red curve) and the 5% significance level are shown (dotted line). The MERS-CoV membrane protein is highlighted (dark gray); it also shows an enrichment of recombination breakpoints.
Fig. 3 | Tracing the evolution of RBD in Sarbecoviruses.
Ancestral reconstruction analyses were performed using the maximum likelihood (ML) phylogenetic tree derived from the RBD recombination event as input. a. Track of the evolutionary changes that occurred in the RBD of human-infecting Sarbecoviruses and their closest relatives. Black circles in the ML tree represent nodes with Shimodaira-Hasegawa-like support higher than 0.80. b. Distribution of likelihood values associated with the most likely amino acid inferred for the different MRCAs in the RBD ancestral reconstruction analysis. Red dots represent positions highlighted in our work (Spike 333, 359, 427, 436, 432–435, 460–472, 484, 505).
Fig. 4 | Evolutionary events preceding the emergence of SARS-CoV-2, and functional impact of amino acids 427N and 436Y.
a. Phylogenetic representation summarizing the evolutionary events that likely led to the emergence of SARS-CoV-2: hit 1) Recombination of the RBD of the Spike protein involving lineages ancestral to SARS-CoV-2, RaTG13 and pangolin sequences (red cross) and SARS-CoV (blue cross); hit 2) SARS-CoV-2 accumulated four nonsynonymous mutations in RBD since its divergence from the MRCA that it shares with RaTG13 and the pangolin CoVs. b. Sliding window analysis (length 267 aa) identifies specific regions of SARS-CoV-2 with high divergence from the RaTG13 bat virus in the RBD of Spike (including 427N and 436Y), as well as in the Ubl1, HRV and SUD domains of nsp3 (non-structural protein 3) within the orf1a polyprotein. c. Functional impact of amino acid 427N in the SARS-CoV-2 Spike protein. Interaction between the human ACE2 receptor (green) and the spike protein (pink) based on SARS-CoV-2 (PDB accession code: 6LZG), highlighting the short helix 427–436 that lies at the interface of the Spike-ACE2 interaction. Dashed lines indicate hydrogen bonds between residues. The configuration shown with higher transparency is that of RaTG13 RBD interaction. d. Functional impact of amino acid 436Y.
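The sliding-window divergence scan in panel b can be illustrated with a short sketch. It assumes two pre-aligned protein sequences of equal length and measures divergence as the fraction of mismatched positions per window; the function name and this simple mismatch metric are assumptions for illustration, and the caption specifies only the 267 aa window length.

```python
def window_divergence(seq_a, seq_b, width=267):
    """Fraction of mismatched positions in each window of two aligned sequences.

    Assumption: gaps are ordinary characters, so a gap against a residue
    counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if width > len(seq_a):
        raise ValueError("window is longer than the alignment")
    mismatches = [a != b for a, b in zip(seq_a, seq_b)]
    current = sum(mismatches[:width])
    divergence = [current / width]
    # Slide one position at a time, updating the count incrementally.
    for start in range(1, len(seq_a) - width + 1):
        current += mismatches[start + width - 1] - mismatches[start - 1]
        divergence.append(current / width)
    return divergence
```

Peaks in the returned list would mark candidate high-divergence regions; mapping them back to named domains requires the genome annotation, which this sketch does not include.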
Fig. 5 | The ancestral recombination event at RBD involving SARS-CoV/SARS-CoV-2 is associated with increased binding affinity to hACE2.
Boxplots represent the distribution of the binding energies of the RBD of each viral strain to hACE2, as inferred by Rosetta. Viral strains (and the analyzed MRCAs) have been labelled with numbers in a hierarchical order, as follows. Outgroup sequence: 0, MRCA from which the MRCA of the recombination cluster derives: 1, MRCA of the recombination event: 2, MRCA of SARS-CoV and SARS-CoV-2 lineages: 1, MRCA of SARS and its bat-SL-CoV relatives: 4A, SARS-CoV: 4B, MRCA of RaTG13, Pangolin-CoV and SARS-CoV-2: 3A, Pangolin-CoV: 3B-P, RaTG13: 3B-R, SARS-CoV-2: 3B-S. The diagram on the right summarizes the progressive increase in binding affinity along the evolutionary trajectories leading to SARS-CoV and SARS-CoV-2. All strains involved in the SARS/SARS-CoV-2 recombination event, including their MRCA, exhibit higher binding affinity (lower binding energy) than the bat SARS-like CoV used as outgroup (MG772933, ‘0’). Binding affinity increased further along the evolution of human-infecting Sarbecoviruses (SARS-CoV, SARS-CoV-2). The highest binding affinity among all strains analyzed is found in SARS-CoV-2 (‘3B-S’) and its MRCA shared with Pangolin-CoV and RaTG13 (‘3A’).
Fig. 6 | The effects of specific alleles on RBD binding affinity to hACE2.
a. Change in the binding energy of SARS-CoV-2 RBD to hACE2 caused by the reverse mutation of each SARS-CoV-2 lineage-specific allele to its ancestral state. Binding energy was assessed by considering each mutation individually as well as all possible combinations among the four different SARS-CoV-2 lineage-specific amino acids. b. Binding energy of RaTG13 RBD to hACE2 after mutating positions 427 and 436 (either individually or both together) to the SARS-CoV/SARS-CoV-2 alleles.
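Panel a considers each reverse mutation alone and in every combination. With four lineage-specific sites this gives 2^4 - 1 = 15 revertant sequences. The sketch below enumerates them; the toy sequence, the site indices, and the ancestral residues are hypothetical placeholders, not the actual SARS-CoV-2 positions, and the binding-energy calculation itself is not reproduced here.

```python
from itertools import combinations

def revertant_sets(sites):
    """Yield every non-empty subset of sites, smallest subsets first."""
    for size in range(1, len(sites) + 1):
        yield from combinations(sites, size)

def apply_reversions(seq, reversions):
    """Return seq with each (0-based index, ancestral residue) substitution applied."""
    residues = list(seq)
    for index, ancestral in reversions:
        residues[index] = ancestral
    return "".join(residues)

# Hypothetical toy data: an 8-residue stand-in sequence and four made-up sites.
toy_rbd = "ACDEFGHK"
toy_sites = [(1, "S"), (3, "Q"), (5, "A"), (7, "R")]

variants = {
    subset: apply_reversions(toy_rbd, subset)
    for subset in revertant_sets(toy_sites)
}
```

Each entry of `variants` would then be scored against hACE2 with whatever binding-energy tool is in use, and compared with the unmutated sequence.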
