Genome Med. 2021 Aug 6;13(1):124.
doi: 10.1186/s13073-021-00943-6.

Recombination and lineage-specific mutations linked to the emergence of SARS-CoV-2

Juan Ángel Patiño-Galindo et al. Genome Med.

Abstract

Background: The emergence of SARS-CoV-2 underscores the need to better understand the evolutionary processes that drive the emergence and adaptation of zoonotic viruses in humans. In the betacoronavirus genus, which also includes SARS-CoV and MERS-CoV, recombination frequently encompasses the receptor binding domain (RBD) of the Spike protein, which is responsible for viral binding to host cell receptors. In this work, we reconstruct the evolutionary events that have accompanied the emergence of SARS-CoV-2, with a special emphasis on the RBD and its adaptation for binding to its receptor, human ACE2.

Methods: By means of phylogenetic and recombination analyses, we found evidence of a recombination event in the RBD involving ancestral lineages to both SARS-CoV and SARS-CoV-2. We then assessed the effect of this recombination at the protein level by reconstructing the RBD of the closest ancestors to SARS-CoV-2, SARS-CoV, and other Sarbecoviruses, including the most recent common ancestor of the recombining clade. The resulting information was used to measure and compare, in silico, their ACE2-binding affinities using the physics-based trRosetta algorithm.

Results: We show that, through an ancestral recombination event, SARS-CoV and SARS-CoV-2 share an RBD sequence that includes two insertions (positions 432-436 and 460-472), as well as the variants 427N and 436Y. Both 427N and 436Y belong to a helix that interacts directly with the human ACE2 (hACE2) receptor. Reconstruction of ancestral states, combined with protein-binding affinity analyses, suggests that the recombination event involving ancestral strains of SARS-CoV and SARS-CoV-2 led to an increased affinity for hACE2 binding and that alleles 427N and 436Y significantly enhanced affinity as well.

Conclusions: We report an ancestral recombination event affecting the RBD of both SARS-CoV and SARS-CoV-2 that was associated with an increased binding affinity to hACE2. Structural modeling indicates that ancestors of SARS-CoV-2 may have acquired the ability to infect humans decades ago. The binding affinity with the human receptor would have been subsequently boosted in SARS-CoV and SARS-CoV-2 through further mutations in RBD.

Keywords: Receptor binding affinity; Recombination; SARS-CoV-2; Zoonosis.

Conflict of interest statement

R.R. is a member of the SAB of AimedBio, consults for Arquimea Research and is founder of Genotwin. PKS is a member of the SAB or Board of Directors of Applied Biomath LLC, Glencoe Software Inc., and RareCyte Inc. and has equity in these companies; he is also on the SAB of NanoString Inc. In the last 5 years, the Sorger lab has received research funding from Novartis and Merck. Sorger declares that none of these relationships are directly or indirectly related to the content of this manuscript. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
Recombination analysis of betacoronaviruses. a Distribution of 103 inferred recombination events among human and non-human beta-CoV isolates showing the span of each recombinant region along the viral genome with respect to SARS-CoV coordinates. The spike protein and its RBD are highlighted. b Sliding window analysis shows (blue curve) the distribution of recombination breakpoints (either start or end) in 800 nucleotide (nt) length windows upstream (namely, in the 5′ to 3′ direction) of every nt position along the viral genome. The spike protein and in particular the RBD and its immediate downstream region are significantly enriched in recombination breakpoints in betacoronaviruses. Benjamini-Yekutieli (BY) corrected p values are shown (red curve), and the 5% BY FDR is shown for reference (dotted line)
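The sliding-window test described in panel b can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' pipeline: the function names are hypothetical, the window for each position is assumed to start at that position and extend 800 nt in the 5′ to 3′ direction, and the per-window p value is assumed to come from a one-sided binomial test against a uniform breakpoint distribution (the caption for Fig. 1 does not name the test). The Benjamini-Yekutieli step follows the standard definition of that correction.

```python
from math import comb


def window_counts(breakpoints, genome_len, window=800):
    """Count breakpoints in the window that starts at each 1-based position.

    Assumption: the window for position p covers p .. p + window - 1,
    clipped at the end of the genome.
    """
    prefix = [0] * (genome_len + 1)
    for b in breakpoints:
        prefix[b] += 1
    for i in range(1, genome_len + 1):
        prefix[i] += prefix[i - 1]
    counts = []
    for pos in range(1, genome_len + 1):
        end = min(genome_len, pos + window - 1)
        counts.append(prefix[end] - prefix[pos - 1])
    return counts


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_yekutieli(pvals):
    """Return BY-adjusted p values in the original order."""
    m = len(pvals)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


def breakpoint_enrichment(breakpoints, genome_len, window=800):
    """Per-position window counts and BY-adjusted enrichment p values."""
    counts = window_counts(breakpoints, genome_len, window)
    n = len(breakpoints)
    pvals = []
    for pos, k in enumerate(counts, start=1):
        width = min(genome_len, pos + window - 1) - pos + 1
        pvals.append(binom_upper_tail(k, n, width / genome_len))
    return counts, benjamini_yekutieli(pvals)
```

Under these assumptions, positions whose adjusted p value falls below 0.05 would correspond to windows above the dotted 5% BY FDR line in the figure.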
Fig. 2
Recombination analysis in MERS coronaviruses. a Distribution of 24 recombination events among human and non-human MERS-CoV isolates. The spike protein and its RBD are highlighted. b Sliding window analysis shows (blue curve) the distribution of recombination breakpoints (either start or end) in 800 nucleotide (nt) length windows upstream (namely, in the 5′ to 3′ direction) of every nt position along the viral genome. The spike protein, and the RBD in particular, overlap with windows that are enriched in recombination breakpoints. Binomial test p values (red curve) and the 5% significance level are shown (dotted line). The MERS-CoV membrane protein is highlighted (dark gray); it also shows an enrichment of recombination breakpoints
Fig. 3
Tracing the evolution of RBD in Sarbecoviruses. a Tanglegram displaying the differences between the trees derived from the RBD recombination segment and that from the rest of the genome. Sequence names were colored according to the host they were sampled from. Circles represent Shimodaira-Hasegawa-like supports higher than 0.80. b Track of the evolutionary changes that occurred in the RBD from human-infecting Sarbecoviruses and their closest relatives. The ancestral reconstruction analyses were performed using the maximum likelihood phylogenetic tree derived from RBD recombination event as input. Black circles in the ML tree represent nodes with Shimodaira-Hasegawa-like support higher than 0.80
Fig. 4
Evolutionary events preceding the SARS-CoV-2 emergence and functional impact of amino acids 427N and 436Y. a Phylogenetic representation summarizing the evolutionary events that likely led to the emergence of SARS-CoV-2: hit [1] recombination of the RBD of the Spike protein involving lineages ancestral to SARS-CoV-2, RaTG13, and pangolin sequences (red cross) and SARS-CoV (blue cross); hit [2] SARS-CoV-2 accumulated four nonsynonymous mutations in RBD since its divergence from the MRCA that it shares with RaTG13 and the pangolin CoVs. b Sliding window analysis (length 267 aa) identifies specific regions of SARS-CoV-2 with high divergence from the RaTG13 bat virus in the RBD of Spike (including 427N and 436Y), as well as in the Ubl1, HRV, and SUD domains of nsp3 (non-structural protein 3) within the orf1a polyprotein. c Functional impact of amino acid 427N in the SARS-CoV-2 Spike protein. Interaction between the human ACE2 receptor (green) and the spike protein (pink) based on SARS-CoV-2 (PDB accession code: 6LZG), highlighting the short helix 427-436 that lies at the interface of the Spike-ACE2 interaction. Dashed lines indicate hydrogen bonds between residues. The configuration shown with higher transparency is that of RaTG13 RBD interaction. d Functional impact of amino acid 436Y
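The sliding-window divergence scan in panel b might look something like the sketch below. It assumes two protein sequences that are already aligned to the same length and measures divergence as the fraction of mismatched positions per window; the caption does not state the divergence measure actually used, and `windowed_divergence` is a hypothetical name.

```python
def windowed_divergence(seq_a, seq_b, window=267):
    """Fraction of differing residues in each sliding window.

    Assumptions: seq_a and seq_b are pre-aligned and of equal length;
    any differing character (including a gap against a residue) counts
    as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window < 1 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")
    mismatch = [int(a != b) for a, b in zip(seq_a, seq_b)]
    running = sum(mismatch[:window])
    divergence = [running / window]
    # Slide by one residue: add the entering column, drop the leaving one.
    for start in range(1, len(seq_a) - window + 1):
        running += mismatch[start + window - 1] - mismatch[start - 1]
        divergence.append(running / window)
    return divergence
```

Peaks in the returned list would mark regions where the two sequences differ most, which is the kind of signal the panel reports for the RBD and the nsp3 domains.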
Fig. 5
The ancestral recombination event at RBD involving SARS-CoV/SARS-CoV-2 is associated with increased affinity to hACE2. Boxplots represent the distribution of the binding energies of the RBD of each viral strain to hACE2, as inferred by Rosetta. Viral strains (and the analyzed MRCAs) have been labeled with numbers in a hierarchical order, as follows. Outgroup sequence: 0, MRCA from which the MRCA of the recombination cluster derives: 1, MRCA of the recombination event: 2, MRCA of SARS-CoV and SARS-CoV-2 lineages: 1, MRCA of SARS and its bat-SL-CoV relatives: 4A, SARS-CoV: 4B, MRCA of RaTG13, Pangolin-CoV and SARS-CoV-2: 3A, Pangolin-CoV: 3B-P, RaTG13: 3B-R, SARS-CoV-2: 3B-S. The diagram on the right summarizes the progressive increase in binding affinity along the evolutionary trajectories leading to SARS-CoV and SARS-CoV-2. All strains involved in the SARS/SARS-CoV-2 recombination event, including their MRCA, exhibit higher binding affinity (lower binding energy) than the bat SARS-like CoV used as outgroup (MG772933, “0”). Binding affinity increased further along the evolution of human-infecting Sarbecoviruses (SARS-CoV, SARS-CoV-2). The highest binding affinity among all strains analyzed is found in SARS-CoV-2 (“3B-S”) and its MRCA shared with Pangolin-CoV and RaTG13 (“3A”)
Fig. 6
The effects of specific alleles on RBD binding affinity to hACE2. a Change in the binding energy of SARS-CoV-2 RBD to hACE2 caused by the reverse mutation of each SARS-CoV-2 lineage-specific allele to its ancestral state. Binding energy was assessed by considering each mutation individually as well as all possible combinations among the four different SARS-CoV-2 lineage-specific amino acids. b Binding energy of RaTG13 RBD to hACE2 after mutating positions 427 and 436 (either individually or both together) to the SARS-CoV/SARS-CoV-2 alleles
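Panel a evaluates each lineage-specific allele alone and in every possible combination, which for four sites gives 15 non-empty sets of reversions. A minimal sketch of that enumeration is shown below. The site labels and helper names are hypothetical placeholders (the caption does not list the four positions), sequence positions are assumed to be 0-based, and the binding-energy calculation itself is not reproduced here.

```python
from itertools import combinations


def reversion_sets(sites):
    """Yield every non-empty subset of `sites`, smallest subsets first."""
    for size in range(1, len(sites) + 1):
        for subset in combinations(sites, size):
            yield subset


def apply_reversions(sequence, reversions):
    """Return `sequence` with each 0-based position in `reversions`
    replaced by its ancestral residue."""
    residues = list(sequence)
    for position, ancestral in reversions.items():
        residues[position] = ancestral
    return "".join(residues)


# Hypothetical placeholder labels for the four lineage-specific sites.
SITES = ["site_A", "site_B", "site_C", "site_D"]
ALL_SETS = list(reversion_sets(SITES))
```

Each tuple in `ALL_SETS` would then be turned into a mutated RBD sequence with `apply_reversions` and passed to whatever scoring tool is used to estimate binding energy.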
