PeerJ. 2020 Oct 16;8:e10234.
doi: 10.7717/peerj.10234. eCollection 2020.

Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function

Alejandro Berrio et al. PeerJ.

Abstract

Background: The emergence of a novel coronavirus (SARS-CoV-2) associated with severe acute respiratory disease (COVID-19) has prompted efforts to understand the genetic basis for its unique characteristics and its jump from non-primate hosts to humans. Tests for positive selection can identify apparently nonrandom patterns of mutation accumulation within genomes, highlighting regions where molecular function may have changed during the origin of a species. Several recent studies of the SARS-CoV-2 genome have identified signals of conservation and positive selection within the gene encoding Spike protein based on the ratio of synonymous to nonsynonymous substitution. Such tests cannot, however, detect changes in the function of RNA molecules.

Methods: Here we apply a test for branch-specific oversubstitution of mutations within narrow windows of the genome without reference to the genetic code.
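The idea of comparing substitution in a narrow window against a background, without using the genetic code, can be illustrated with a deliberately simplified sketch. This is not the test used in the study: the function name `window_rate_ratio`, the use of a single ancestral sequence, and the genome-wide rate as the background are all assumptions made for illustration, and no significance testing is attempted.

```python
def substitution_rate(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def window_rate_ratio(foreground, ancestor, start, size):
    """Hypothetical helper: window substitution rate divided by the
    genome-wide rate. Codon positions are ignored on purpose, so the
    ratio is blind to synonymous versus nonsynonymous status."""
    background = substitution_rate(foreground, ancestor)
    if background == 0:
        return 0.0
    window = substitution_rate(
        foreground[start:start + size], ancestor[start:start + size]
    )
    return window / background


if __name__ == "__main__":
    # Toy aligned sequences; two substitutions clustered in one window.
    print(window_rate_ratio("AAAAAAAAAA", "AAAATTAAAA", start=4, size=2))
```

A ratio well above 1 marks a window that has accumulated more substitutions than the background would predict; a real analysis would also need a phylogenetic model and a statistical test.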

Results: We recapitulate the finding that the gene encoding Spike protein has been a target of both purifying and positive selection. In addition, we find other likely targets of positive selection within the genome of SARS-CoV-2, specifically within the genes encoding Nsp4 and Nsp16. Homology-directed modeling indicates no change in either Nsp4 or Nsp16 protein structure relative to the most recent common ancestor. These SARS-CoV-2-specific mutations may affect molecular processes mediated by the positive or negative RNA molecules, including transcription, translation, RNA stability, and evasion of the host innate immune system. Our results highlight the importance of considering mutations in viral genomes not only from the perspective of their impact on protein structure, but also how they may impact other molecular processes critical to the viral life cycle.

Keywords: COVID-19; Evolution; Genome; Positive selection; SARS-CoV-2.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Distribution of evolutionary rate and positive selection across multiple species of coronaviruses of the Sarbecovirus subgenus.
(A) Distribution of the evolutionary ratio, ζ, along multiple viral genome alignments. Red dots indicate significant values of ζ from the adaptiPhy test; black dots represent neutral evolution or purifying selection in the foreground branch. (B) Visualization of selection within Spike protein among species. Dark red marks windows where ζ is higher than 4, red marks a significant ζ, and gray indicates neutral or purifying selection. RBD, receptor binding domain. Tertiary structure of Spike protein depicting the location of positive selection and amino acid substitutions in (C) SARS-CoV-2 and (D) SARS-CoV.
Figure 2. Distribution of positive selection, PhastCons conservation, and polymorphic variation across the SARS-CoV-2 genome.
(A) Evolutionary rate (ζ), with sites under significant branch-specific selection shown as red dots. (B) Conservation values (PhastCons) along the SARS-CoV-2 genome, with highly conserved windows (PhastCons >0.9) shown as blue dots above the dashed line. (C) Alternative allele frequency for 5,000 high-quality genomes available in NCBI. The dotted line represents an arbitrary threshold of 0.6, and SNPs in strong linkage disequilibrium are highlighted with arrowheads under black bars. (D) Allele density in windows of 500 bp with a step of 150 bp. Red boxes in B and C mark regions under positive selection, while blue boxes mark high conservation. (E) Annotations for all the mature proteins known to be expressed in SARS-CoV-2.
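The windowing described for panel D (500 bp windows advanced in 150 bp steps) can be sketched as follows. This is only an illustration of the sliding-window count, assuming 0-based variant positions and half-open windows; the function name `allele_density` and the toy input are hypothetical, and the paper's own pipeline may differ in details such as how the final partial window is handled.

```python
def allele_density(variant_positions, genome_length, window=500, step=150):
    """Count variant sites in overlapping windows along a genome.

    Returns a list of (start, end, count) tuples, where each window
    covers the half-open interval [start, end). Windows that would run
    past the end of the genome are not emitted (an assumption).
    """
    positions = sorted(variant_positions)
    densities = []
    last_start = max(genome_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        count = sum(start <= p < end for p in positions)
        densities.append((start, end, count))
    return densities


if __name__ == "__main__":
    # Toy example: three variant sites on a 1,000 bp sequence.
    for start, end, count in allele_density([10, 160, 700], 1000):
        print(start, end, count)
```

Because the step is smaller than the window, neighboring windows overlap and a single variant contributes to several counts, which smooths the density track.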
Figure 3. RNA and Protein sequences of Spike, Nsp4 and Nsp16.
Each panel (A–C) shows selected RNA and protein sequences scoring high for positive selection in the SARS-CoV-2 branch and other branches (highlighted in red). Changes with respect to SARS-CoV-2 are highlighted in different colors.
Figure 4. Regions of coronavirus genomes that violate the species tree.
The species tree topology is shown on the left. (A–L) Tree topologies that were different from the expected topology. (M) Coronavirus genome track where the regions scoring high for positive selection in SARS-CoV-2 are highlighted in orange, regions with unexpected tree topologies highlighted in dark gray.
