Science. 2022 Jun 17;376(6599):1327-1332.
doi: 10.1126/science.abm1208. Epub 2022 May 24.

Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness

Fritz Obermeyer et al. Science. 2022.

Abstract

Repeated emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants with increased fitness underscores the value of rapid detection and characterization of new lineages. We have developed PyR0, a hierarchical Bayesian multinomial logistic regression model that infers relative prevalence of all viral lineages across geographic regions, detects lineages increasing in prevalence, and identifies mutations relevant to fitness. Applying PyR0 to all publicly available SARS-CoV-2 genomes, we identify numerous substitutions that increase fitness, including previously identified spike mutations and many nonspike mutations within the nucleocapsid and nonstructural proteins. PyR0 forecasts growth of new lineages from their mutational profile, ranks the fitness of lineages as new sequences become available, and prioritizes mutations of biological and public health concern for functional characterization.
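The abstract describes the model only at a high level. As a rough illustration of the idea behind multinomial logistic regression on lineage prevalence, the sketch below shows a deterministic toy version: each lineage's log-prevalence is linear in time, and its growth rate is the sum of the effects of the mutations it carries. This is not the authors' PyR0 implementation, which is hierarchical, Bayesian, and fitted across regions. The function names, mutation labels, and coefficient values are all hypothetical.

```python
import math

def softmax(logits):
    """Turn unnormalized log-prevalences into proportions that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lineage_growth_rates(features, mutation_effects):
    """Growth rate of a lineage = sum of the effects of its mutations (assumed linear)."""
    return [sum(mutation_effects[m] for m in muts) for muts in features]

def predict_proportions(intercepts, rates, t):
    """Multinomial logistic prediction of lineage proportions at time t."""
    logits = [a + b * t for a, b in zip(intercepts, rates)]
    return softmax(logits)

# Hypothetical toy inputs: two made-up mutations with per-day effects,
# and three lineages defined by which mutations they carry.
mutation_effects = {"mut_a": 0.05, "mut_b": 0.10}
features = [[], ["mut_a"], ["mut_a", "mut_b"]]
intercepts = [0.0, -3.0, -6.0]  # the baseline lineage starts out dominant

rates = lineage_growth_rates(features, mutation_effects)
early = predict_proportions(intercepts, rates, t=0)
late = predict_proportions(intercepts, rates, t=100)
```

In this toy setup the lineage carrying both mutations starts rare and ends up dominant, which is the qualitative behavior a growth-rate model is meant to capture. Forecasting a new lineage from its mutational profile corresponds to calling `lineage_growth_rates` on a mutation set that was not used to choose the coefficients.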

Figures

Fig. 1. Relative fitness versus date of lineage emergence.
Circle size is proportional to cumulative case count inferred from lineage proportion estimates and confirmed case counts. Inset table lists the 10 fittest lineages inferred by the model. R/R_A is the fold increase in relative fitness over the Wuhan (A) lineage, assuming a fixed generation time of 5.5 days.
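One common way to turn a per-day growth-rate advantage into a per-generation fold increase is to exponentiate the advantage multiplied by the generation time. The sketch below does this with the 5.5-day generation time stated in the caption. The exponential-growth conversion and the function names are assumptions made for illustration; the caption does not spell out the exact formula used by the authors.

```python
import math

GENERATION_TIME_DAYS = 5.5  # fixed generation time stated in the Fig. 1 caption

def delta_log_r(delta_rate_per_day, generation_time=GENERATION_TIME_DAYS):
    """Change in log relative fitness per generation (assumed: rate times generation time)."""
    return delta_rate_per_day * generation_time

def fold_increase(delta_rate_per_day, generation_time=GENERATION_TIME_DAYS):
    """Fold increase in relative fitness over the reference lineage, under the same assumption."""
    return math.exp(delta_log_r(delta_rate_per_day, generation_time))

# Hypothetical example: an advantage that doubles relative fitness every generation.
doubling_rate = math.log(2) / GENERATION_TIME_DAYS
doubling_fold = fold_increase(doubling_rate)
```

Under this assumption a lineage with no growth advantage has a fold increase of exactly 1, and the result scales with the assumed generation time, so a different generation time would change the reported fold increases.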
Fig. 2. Manhattan plot of amino acid changes assessed in this study.
(A) Changes across the entire genome. (B) Changes in the first 850 amino acids of S. (C) Changes in the first 250 amino acids of N. In each of (A) to (C), the y axis shows effect size Δ log R, the estimated change in log relative fitness due to each amino acid change. The bottom three axes show the background density of all observed amino acid changes, the density of those associated with growth (weighted by |Δ log R|), and the ratio of the two. The top 55 amino acid changes are labeled. See fig. S13 for detailed views of S, N, ORF1a, and ORF1b. (D) Structure of the spike-ACE2 complex (PDB: 7KNB). Spike subunits are colored light blue, light orange, and gray. Top-ranked mutations are shown as red spheres. ACE2 is shown in magenta. (E) Close-up view of the RBD interface. (F) Top-ranked mutations in the N-terminal RNA-binding domain of N. Residues 44-180 of N (PDB: 7ACT) are shown in light blue. Amino acid positions corresponding to top mutations in this region are shown as red spheres. A 10-nt bound RNA is shown in gray.
Fig. 3. (A) Infectivity relative to WT of lentiviral vectors pseudotyped with the indicated Spike mutants.
Target cells were HEK293T cells expressing ACE2 and TMPRSS2 transgenes. The genetic background of the Spike was Wuhan-Hu-1 bearing D614G. Red bars were significantly different from WT (adjusted p values shown). Black bars were not significantly different from WT. (B) For the 1701 SARS-CoV-2 clusters with at least one amino acid substitution in the RBD, we compare (i) the PyR0 prediction for the contribution to Δ log R from RBD substitutions only with (ii) antibody binding computed using the antibody-escape calculator in (20). The escape calculator is based on an intuitive nonlinear model parameterized using deep mutational scanning data for 33 neutralizing antibodies elicited by SARS-CoV-2. PyR0 predictions exhibit high (Spearman) correlation with predictions from Greaney et al. (20). (C to E) We dissect PyR0 Δ log R estimates into S-gene (C), RBD (D), and non-S-gene (E) contributions for 3000 SARS-CoV-2 clusters (blue dots). The horizontal axis corresponds to the date at which each cluster first emerged. Red squares denote the median Δ log R within each monthly bin. The increased importance of S-gene mutations (notably in the RBD) over non-S-gene mutations starting around November 2021 is apparent.

References

    1. Davies N. G., Abbott S., Barnard R. C., Jarvis C. I., Kucharski A. J., Munday J. D., Pearson C. A. B., Russell T. W., Tully D. C., Washburne A. D., Wenseleers T., Gimma A., Waites W., Wong K. L. M., van Zandvoort K., Silverman J. D., Diaz-Ordaz K., Keogh R., Eggo R. M., Funk S., Jit M., Atkins K. E., Edmunds W. J.; CMMID COVID-19 Working Group; COVID-19 Genomics UK (COG-UK) Consortium, Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science 372, eabg3055 (2021). doi: 10.1126/science.abg3055
    2. Volz E., Mishra S., Chand M., Barrett J. C., Johnson R., Geidelberg L., Hinsley W. R., Laydon D. J., Dabrera G., O’Toole Á., Amato R., Ragonnet-Cronin M., Harrison I., Jackson B., Ariani C. V., Boyd O., Loman N. J., McCrone J. T., Gonçalves S., Jorgensen D., Myers R., Hill V., Jackson D. K., Gaythorpe K., Groves N., Sillitoe J., Kwiatkowski D. P., Flaxman S., Ratmann O., Bhatt S., Hopkins S., Gandy A., Rambaut A., Ferguson N. M.; COVID-19 Genomics UK (COG-UK) Consortium, Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England. Nature 593, 266–269 (2021).
    3. Stefanelli P., Trentini F., Guzzetta G., Marziano V., Mammone A., Poletti P., Grané C. M., Manica M., del Manso M., Andrianou X., et al., Co-circulation of SARS-CoV-2 variants B.1.1.7 and P.1. medRxiv (2021) (available at https://www.medrxiv.org/content/10.1101/2021.04.06.21254923v1.abstract).
    4. Stefanelli P., Trentini F., Guzzetta G., Marziano V., Mammone A., Sane Schepisi M., Poletti P., Molina Grané C., Manica M., Del Manso M., Andrianou X., Ajelli M., Rezza G., Brusaferro S., Merler S.; COVID-19 National Microbiology Surveillance Study Group, Co-circulation of SARS-CoV-2 Alpha and Gamma variants in Italy, February and March 2021. Euro Surveill. 27 (2022). doi: 10.2807/1560-7917.ES.2022.27.5.2100429
    5. Vöhringer H. S., Sanderson T., Sinnott M., De Maio N., Nguyen T., Goater R., Schwach F., Harrison I., Hellewell J., Ariani C. V., Gonçalves S., Jackson D. K., Johnston I., Jung A. W., Saint C., Sillitoe J., Suciu M., Goldman N., Panovska-Griffiths J., Birney E., Volz E., Funk S., Kwiatkowski D., Chand M., Martincorena I., Barrett J. C., Gerstung M.; Wellcome Sanger Institute COVID-19 Surveillance Team; COVID-19 Genomics UK (COG-UK) Consortium, Genomic reconstruction of the SARS-CoV-2 epidemic in England. Nature 600, 506–511 (2021).
