Fitness effects of mutations to SARS-CoV-2 proteins

Virus Evol. 2023 Sep 18;9(2):vead055. doi: 10.1093/ve/vead055. eCollection 2023.

Jesse D Bloom^{1,2,3}, Richard A Neher^{4,5}

Affiliations

¹ Basic Sciences and Computational Biology, Fred Hutchinson Cancer Center, 1100 Fairview Ave N, Seattle, WA 98109, USA.
² Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195, USA.
³ Howard Hughes Medical Institute, 1100 Fairview Ave N, Seattle, WA 98109, USA.
⁴ Biozentrum, University of Basel, Spitalstrasse 41, Basel 4056, Switzerland.
⁵ Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland.

PMID: 37727875
PMCID: PMC10506532
DOI: 10.1093/ve/vead055

Jesse D Bloom et al. Virus Evol. 2023.


Erratum in

Correction to: Fitness effects of mutations to SARS-CoV-2 proteins.
[No authors listed] Virus Evol. 2024 Mar 26;10(1):veae026. doi: 10.1093/ve/veae026. eCollection 2024. PMID: 38577658. Free PMC article.

Abstract

Knowledge of the fitness effects of mutations to SARS-CoV-2 can inform assessment of new variants, design of therapeutics resistant to escape, and understanding of the functions of viral proteins. However, experimentally measuring effects of mutations is challenging: we lack tractable lab assays for many SARS-CoV-2 proteins, and comprehensive deep mutational scanning has been applied to only two SARS-CoV-2 proteins. Here, we develop an approach that leverages millions of publicly available SARS-CoV-2 sequences to estimate effects of mutations. We first calculate how many independent occurrences of each mutation are expected to be observed along the SARS-CoV-2 phylogeny in the absence of selection. We then compare these expected observations to the actual observations to estimate the effect of each mutation. These estimates correlate well with deep mutational scanning measurements. For most genes, synonymous mutations are nearly neutral, stop-codon mutations are deleterious, and amino acid mutations have a range of effects. However, some viral accessory proteins are under little to no selection. We provide interactive visualizations of effects of mutations to all SARS-CoV-2 proteins (https://jbloomlab.github.io/SARS2-mut-fitness/). The framework we describe is applicable to any virus for which the number of available sequences is sufficiently large that many independent occurrences of each neutral mutation are observed.
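The comparison of expected to actual counts can be illustrated with a minimal sketch. It assumes the effect of a mutation is summarized as the natural log of the ratio of actual to expected counts, with a pseudocount added to both so that unobserved mutations still give a finite value. The log-ratio form, the pseudocount of 0.5, the function name, and the example counts are illustrative assumptions; none of them is stated in this abstract.

```python
import math


def mutation_effect(actual: float, expected: float, pseudocount: float = 0.5) -> float:
    """Log ratio of actual to expected counts (assumed form, not taken from the abstract).

    Values near zero suggest a nearly neutral mutation, negative values a
    deleterious one, and positive values a beneficial one.
    """
    if actual < 0 or expected < 0:
        raise ValueError("counts must be non-negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log((actual + pseudocount) / (expected + pseudocount))


# Hypothetical (actual, expected) counts, for illustration only.
examples = {
    "observed as often as expected": (40, 40.0),
    "observed less often than expected": (2, 40.0),
    "observed more often than expected": (120, 40.0),
}

for label, (actual, expected) in examples.items():
    print(f"{label}: {mutation_effect(actual, expected):+.2f}")
```

The pseudocount matters mostly when expected counts are small, which is one reason a minimum expected-count threshold is useful before interpreting individual estimates.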

Keywords: COVID-19; UShER; dN/dS; deep mutational scanning; fitness; mutation rate.

Conflict of interest statement

J.D.B. consults for Apriori Bio, Aerium Therapeutics, Invivyd, the Vaccine Company, GSK, and Pfizer on topics related to viral evolution. J.D.B. receives royalty payments as an inventor on Fred Hutch licensed patents related to deep mutational scanning of viral proteins.

Figures

**Figure 1.**
Expected versus actual counts of mutations. (A) The number of expected counts of each type of nucleotide mutation is computed from four-fold degenerate sites, and then compared to the actual counts of each mutation. (B) Expected versus actual counts for each nucleotide mutation type aggregated across all viral clades and averaged across all sites where the mutation is four-fold degenerate, synonymous (including four-fold degenerate), nonsynonymous, or introduces a stop codon. See https://jbloomlab.github.io/SARS2-mut-fitness/avg_counts.html for an interactive version of panel B that enables mouseovers to read off specific values.
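One way to picture the computation in panel A is sketched below. It assumes the expected count for a mutation type (for example C to T) is the average number of occurrences of that type per four-fold degenerate site with the same parental nucleotide. The averaging scheme, the input layout, and the counts are hypothetical; the caption does not specify how clades or sites are weighted.

```python
from collections import defaultdict


def expected_counts_per_site(fourfold_sites):
    """Average count of each nucleotide mutation type over four-fold degenerate sites.

    fourfold_sites: iterable of (parent_nt, {mutant_nt: observed_count}) pairs,
    one pair per four-fold degenerate site (hypothetical input layout).
    Returns {(parent_nt, mutant_nt): mean count per site with that parent}.
    """
    totals = defaultdict(float)
    n_sites = defaultdict(int)
    for parent, counts in fourfold_sites:
        n_sites[parent] += 1  # sites with no observed mutations still count
        for mutant, n in counts.items():
            if mutant == parent:
                raise ValueError("mutant nucleotide must differ from parent")
            totals[(parent, mutant)] += n
    return {key: total / n_sites[key[0]] for key, total in totals.items()}


# Hypothetical counts at three four-fold degenerate sites.
sites = [
    ("C", {"T": 30, "A": 2}),
    ("C", {"T": 50}),
    ("G", {"T": 12, "A": 4}),
]
expected = expected_counts_per_site(sites)
for (parent, mutant), value in sorted(expected.items()):
    print(f"{parent}->{mutant}: {value:.1f} expected counts per site")
```

Because mutations at four-fold degenerate sites never change the encoded amino acid, their counts serve here as a stand-in for what is seen in the absence of selection.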

**Figure 2.**
Correlations of mutation fitness effect estimates made using subsets of natural sequences. Correlations between estimates made (A) just using sequences from the Delta or Omicron BA.5 clades or (B) just from the USA or England. Each point is an amino acid mutation, the orange line is a least-squares regression, and orange text at upper left shows the number of mutations and Pearson’s correlation coefficient. Only mutations with at least 10 expected counts are shown, which is why panels have different numbers of mutations shown (sequence subsets vary in size). Differences in subset size are also the reason why the regression line in (A) deviates from the identity x = y. (C) Correlations between clade or geography subsets become higher with an increasingly large threshold for minimum expected counts. Spike mutations have a worse correlation when subsetting by viral clade (plot shows average correlation over all pairwise combinations of Delta, BA.1, BA.2, and BA.5), but not when subsetting by geography (USA or England). (D) Correlations in estimated mutation effects decline for clades with higher protein divergence, with the effect most noticeable for spike since spike is more diverged among SARS-CoV-2 clades than other viral proteins. See https://jbloomlab.github.io/SARS2-mut-fitness/clade_corr_chart.html and https://jbloomlab.github.io/SARS2-mut-fitness/subset_corr_chart.html for versions of A and B that include all viral clades with at least 500,000 total expected counts (summed across all mutations) and have other interactive options.
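The filtering and correlation described in this caption can be sketched as follows. The sketch assumes each subset is stored as a mapping from mutation to an (effect, expected count) pair and that the minimum expected-count threshold must be met in both subsets. The data layout, the function names, and the numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def subset_correlation(subset_a, subset_b, min_expected=10):
    """Correlate effects for mutations passing the threshold in both subsets.

    subset_a, subset_b: {mutation: (effect, expected_count)} (hypothetical layout).
    Returns (number of mutations compared, Pearson r).
    """
    shared = [
        m for m in subset_a
        if m in subset_b
        and subset_a[m][1] >= min_expected
        and subset_b[m][1] >= min_expected
    ]
    xs = [subset_a[m][0] for m in shared]
    ys = [subset_b[m][0] for m in shared]
    return len(shared), pearson_r(xs, ys)


# Hypothetical estimates from two sequence subsets.
subset_a = {"m1": (-2.0, 50), "m2": (-1.0, 30), "m3": (0.0, 20), "m4": (1.0, 3)}
subset_b = {"m1": (-4.0, 40), "m2": (-2.0, 25), "m3": (0.0, 15), "m4": (-3.0, 12)}

n_strict, r_strict = subset_correlation(subset_a, subset_b, min_expected=10)
n_loose, r_loose = subset_correlation(subset_a, subset_b, min_expected=1)
print(f"threshold 10: n={n_strict}, r={r_strict:.2f}")
print(f"threshold 1:  n={n_loose}, r={r_loose:.2f}")
```

In this toy data the poorly sampled mutation lowers the correlation once it is admitted, which mirrors the trend described for panel C.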

**Figure 3.**
Distribution of effects of different classes of mutations. (A) Histograms of effects of synonymous, nonsynonymous, and stop-codon mutations across all viral genes. Neutral mutations have effects of zero (dashed gray vertical lines), and deleterious mutations have negative effects. (B) Effects of each class of mutation for each viral gene. Dark squares indicate the median effect, and the lighter rectangles span the interquartile range. Mutation types are color-coded as in panel (A). The apparent constraint on synonymous mutations in ORF9b is probably because this gene is encoded in an overlapping reading frame with N (Jungreis et al. 2021). See https://jbloomlab.github.io/SARS2-mut-fitness/effects_histogram.html and https://jbloomlab.github.io/SARS2-mut-fitness/effects_dist.html for plots that allow adjustment of the expected-count cutoff and other interactive options (such as separate histograms for each gene). See Supplementary Fig. S3 for a version of panel B with genes ordered by genomic position rather than constraint on nonsynonymous mutations.
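The per-class summary in panel B (median and interquartile range) can be reproduced in outline with the standard library. The effect values below are hypothetical, and the quartile convention (the default of `statistics.quantiles`) is an assumption; the caption does not say which convention was used.

```python
import statistics


def summarize_by_class(effects_by_class):
    """Median and quartiles of effects for each mutation class.

    effects_by_class: {class_name: list of effect estimates} (hypothetical layout).
    """
    summary = {}
    for cls, effects in effects_by_class.items():
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _, q3 = statistics.quantiles(effects, n=4)
        summary[cls] = {
            "q1": q1,
            "median": statistics.median(effects),
            "q3": q3,
        }
    return summary


# Hypothetical effect estimates for three classes of mutation.
effects_by_class = {
    "synonymous": [-0.4, -0.1, 0.0, 0.1, 0.3],
    "nonsynonymous": [-3.0, -2.0, -1.0, 0.0, 1.0],
    "stop": [-6.0, -5.0, -4.5, -4.0, -3.0],
}

summary = summarize_by_class(effects_by_class)
for cls, stats in summary.items():
    print(f"{cls}: median={stats['median']:.2f}, IQR=[{stats['q1']:.2f}, {stats['q3']:.2f}]")
```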

**Figure 4.**
Correlation of mutation-effect estimates with experimental deep mutational scanning measurements for (A) the full spike (Dadonaite et al. 2023) or its RBD (Starr et al. 2022b), and (B) Mpro (Flynn et al. 2023; Iketani et al. 2022a). Each point is an amino acid mutation, the orange line is a least-squares regression, and orange text in the upper left shows the number of mutations and Pearson’s correlation coefficient. Each subpanel shows a different set of mutations (depending on which mutations were measured in that experiment). See https://jbloomlab.github.io/SARS2-mut-fitness/dms_S_corr.html and https://jbloomlab.github.io/SARS2-mut-fitness/dms_nsp5_corr.html for plots that also show the Mpro dataset from Flynn et al. (2022) and have various interactive options. The plots in this figure show the average of the multiple phenotypes measured in the deep mutational scanning of Starr et al. (2022b); see https://jbloomlab.github.io/SARS2-mut-fitness/dms_S_all_corr.html for each phenotype separately. This figure only shows mutations with at least 20 expected counts, which is higher than the threshold of 10 used in most of the rest of this paper (this threshold can be adjusted in the interactive plots).

**Figure 5.**
Effects of amino acid mutations to E protein. The area plot at top shows the average effects of mutations at each site, and the heatmap shows the effects of specific amino acids, with x denoting the amino acid identity in the Wuhan-Hu-1 strain. See https://jbloomlab.github.io/SARS2-mut-fitness/E.html for an interactive version of this plot that enables zooming, mouseovers, adjustment of the minimum expected count threshold, and layering of stop codon effects on the site plot. See https://jbloomlab.github.io/SARS2-mut-fitness for comparable interactive plots for all SARS-CoV-2 proteins.

Update of

Fitness effects of mutations to SARS-CoV-2 proteins.
Bloom JD, Neher RA. bioRxiv [Preprint]. 2023 Jun 6:2023.01.30.526314. doi: 10.1101/2023.01.30.526314. Update in: Virus Evol. 2023 Sep 18;9(2):vead055. doi: 10.1093/ve/vead055. PMID: 36778462. Free PMC article. Updated. Preprint.

Cited by

Full-spike deep mutational scanning helps predict the evolutionary success of SARS-CoV-2 clades.
Dadonaite B, Brown J, McMahon TE, Farrell AG, Asarnow D, Stewart C, Logue J, Murrell B, Chu HY, Veesler D, Bloom JD. bioRxiv [Preprint]. 2023 Nov 14:2023.11.13.566961. doi: 10.1101/2023.11.13.566961. Update in: Nature. 2024 Jul;631(8021):617-626. doi: 10.1038/s41586-024-07636-1. PMID: 38014024. Free PMC article. Updated. Preprint.
Spike deep mutational scanning helps predict success of SARS-CoV-2 clades.
Dadonaite B, Brown J, McMahon TE, Farrell AG, Figgins MD, Asarnow D, Stewart C, Lee J, Logue J, Bedford T, Murrell B, Chu HY, Veesler D, Bloom JD. Nature. 2024 Jul;631(8021):617-626. doi: 10.1038/s41586-024-07636-1. Epub 2024 Jul 3. PMID: 38961298. Free PMC article.
Adsorption-Driven Deformation and Footprints of the RBD Proteins in SARS-CoV-2 Variants on Biological and Inanimate Surfaces.
Bosch AM, Guzman HV, Pérez R. J Chem Inf Model. 2024 Aug 12;64(15):5977-5990. doi: 10.1021/acs.jcim.4c00460. Epub 2024 Jul 31. PMID: 39083670. Free PMC article.
Balancing stability and function: impact of the surface charge of SARS-CoV-2 Omicron spike protein.
Lauster D, Haag R, Ballauff M, Herrmann A. Npj Viruses. 2025 Apr 1;3(1):23. doi: 10.1038/s44298-025-00104-1. PMID: 40295844. Free PMC article. Review.
Phylogenetic signatures reveal multilevel selection and fitness costs in SARS-CoV-2.
Bonetti Franceschi V, Volz E. Wellcome Open Res. 2024 Jul 24;9:85. doi: 10.12688/wellcomeopenres.20704.2. eCollection 2024. PMID: 39132669. Free PMC article.
