Microb Genom. 2021 Dec;7(12):000734.
doi: 10.1099/mgen.0.000734.

SARS-CoV-2 genetic variations associated with COVID-19 pathogenicity

Pakorn Aiewsakun et al. Microb Genom. 2021 Dec.

Abstract

In this study, we performed genome-wide association analyses on SARS-CoV-2 genomes to identify genetic mutations associated with pre-symptomatic/asymptomatic COVID-19 cases. Various potential covariates and confounding factors of COVID-19 severity, including patient age, gender and country, as well as virus phylogenetic relatedness were adjusted for. In total, 3021 full-length genomes of SARS-CoV-2 generated from original clinical samples and whose patient status could be determined conclusively as either 'pre-symptomatic/asymptomatic' or 'symptomatic' were retrieved from the GISAID database. We found that the mutation 11 083G>T, located in the coding region of non-structural protein 6, is significantly associated with asymptomatic COVID-19. Patient age is positively correlated with symptomatic infection, while gender is not significantly correlated with the development of the disease. We also found that the effects of the mutation, patient age and gender do not vary significantly among countries, although each country appears to have varying baseline chances of COVID-19 symptom development.

Keywords: GWAS; SARS-CoV-2; nsp6.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Fig. 1.
SARS-CoV-2 phylogeny. The tree was estimated under the maximum-likelihood framework implemented in IQ-TREE2 [27] based on a manually curated alignment of 3021 full-length SARS-CoV-2 genomes. Potential recombination within the alignment was checked by using the Phi test implemented in SplitsTree4 [26], but no evidence was found (P=0.91). The best-fit nucleotide substitution model was determined to be GTR+F+R5 (the general time reversible model+empirical base frequencies+the 5-discrete-rate-category FreeRate model) by ModelFinder [28] under the Bayesian information criterion and was used for tree reconstruction. We compared our tree with the global tree obtained from GISAID, and determined the terminal branch leading to sample EPI_ISL_407976 as a suitable location for root placement. Bar, substitutions per site. The tree file in Newick format with bootstrap clade-support values, computed based on 1000 bootstrap trees, can be found in Data S1. The three columns on the right indicate the United Nations (UN) geoscheme subregion, the GISAID haplogroup assignment and the patient status of the sequences, respectively (see keys). The world map below the tree shows the countries from which the sequences were sampled, colored according to the UN geoscheme subregions.
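The caption above says the substitution model was chosen under the Bayesian information criterion (BIC). As a rough illustration of how BIC ranks candidate models, the sketch below computes BIC = k ln(n) - 2 ln(L) for two candidates and picks the lower score. The model names, log-likelihoods, parameter counts and alignment length are all made-up placeholders, not values from the study, and the use of the number of alignment sites as the sample size n is an assumption rather than a description of what ModelFinder does internally.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


# Hypothetical candidates: name -> (log-likelihood, number of free parameters).
# These numbers are illustrative only and do not come from the paper.
candidates = {
    "model_A": (-1000.0, 10),
    "model_B": (-990.0, 20),
}
n_sites = 1000  # hypothetical alignment length

scores = {name: bic(ll, k, n_sites) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)
```

In this toy case the simpler model is preferred even though its likelihood is lower, because the penalty for ten extra parameters outweighs the gain in fit.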
Fig. 2.
Screening for candidate sites with genetic variations associated with COVID-19 pathogenicity by using TreeWAS [29]. (a) Maximum-likelihood tree of SARS-CoV-2 (as shown in Fig. 1) is shown on the left. Bar, substitutions per site. The viruses’ patient status (pat. stat.) and mutational profiles of the 26 polymorphic sites investigated (Table 1) are shown on the right (see keys for details). Sites determined as strongly linked loci are indicated with black horizontal bars and numbers on the top. (b) Three separate tests of genotype–phenotype association implemented in the software TreeWAS [29] were performed, namely ‘Terminal’ (left), ‘Simultaneous’ (middle) and ‘Subsequent’ tests (right) with Bonferroni multiple-testing correction (adjusted P value threshold=5 %/17 sets of polymorphic sites analysed=0.294 %). To account for phylogenetic uncertainty, the tests were applied to the entire distribution of the 1000 bootstrap trees to obtain the distributions of correlation scores and null scores (Cor. score null dist.). The horizontal red strips indicate the 95 % highest density intervals of the score cut-offs obtained from the 1000 bootstrap analyses. The horizontal red dotted lines indicate the score cut-off obtained from the maximum-likelihood tree analysis. All tests revealed that site 11 083 had the highest scores (horizontal red solid lines). Simultaneous tests suggested that site 11 083 was the only site with genetic variations significantly associated with COVID-19 patient status (marked with an asterisk, positive bootstrap testing rate=58.5 %), while the other two tests did not detect significant signals.
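The Bonferroni threshold quoted in the caption above (5 % divided by 17 sets of polymorphic sites) can be checked with a few lines of arithmetic. The function name below is a hypothetical helper introduced only for this illustration; the inputs 0.05 and 17 are taken from the caption.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Family-wise alpha of 5 % spread over 17 sets of polymorphic sites.
threshold = bonferroni_threshold(0.05, 17)
print(f"{threshold * 100:.3f} %")  # prints "0.294 %"
```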
Fig. 3.
Adjusted odds ratios and 95 % confidence intervals of various potential risk factors for COVID-19 symptom development. The values were estimated based on the best-fit binomial generalized linear-mixed model M2, in which the effects of the mutation 11 083G>T, patient gender and age on the disease outcome were treated as fixed effects, and the effects of country sampling and virus phylogenetic relatedness were considered random effects. The model allowed each individual virus and country to have varying baseline chances of symptom development while adjusting for the virus phylogenetic structure. See model specification in Table S6 and estimated parameter values in Table S7.
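For readers less familiar with how odds ratios relate to a binomial (logistic) model, the sketch below converts a fixed-effect coefficient on the log-odds scale into an odds ratio with a 95 % Wald confidence interval. This is a generic textbook conversion shown under stated assumptions: the page does not say how the study's intervals were computed, and the coefficient and standard error used here are hypothetical placeholders, not estimates from Table S7.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95 % interval


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95):
    """Return (odds ratio, lower bound, upper bound) from a log-odds
    coefficient and its standard error, using a Wald interval."""
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, lower, upper = odds_ratio_with_ci(beta=-0.5, se=0.2)
print(f"OR={odds_ratio:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

An odds ratio below 1 with an interval that excludes 1 would indicate lower odds of symptom development for that factor; an interval spanning 1 would indicate no significant effect at the 5 % level.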

References

    1. Aiewsakun P, Nilplub P, Wongtrakoongate P, Hongeng S, Thitithanyanont A. SARS-CoV-2 genetic variations associated with COVID-19 pathogenicity. Figshare. 2021. doi: 10.6084/m9.figshare.16528950.v1.
    2. Zhu N, Zhang D, Wang W, Li X, Yang B, et al. A novel coronavirus from patients with pneumonia in China, 2019. N Engl J Med. 2020;382:727–733. doi: 10.1056/NEJMoa2001017.
    3. Gorbalenya AE, Baker SC, Baric RS, de Groot RJ, Drosten C, et al. The species Severe acute respiratory syndrome-related coronavirus: Classifying 2019-nCoV and naming it SARS-CoV-2. Nat Microbiol. 2020;5:536–544. doi: 10.1038/s41564-020-0695-z.
    4. Huang C, Wang Y, Li X, Ren L, Zhao J, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395:497–506. doi: 10.1016/S0140-6736(20)30183-5.
    5. Chen N, Zhou M, Dong X, Qu J, Gong F, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395:507–513. doi: 10.1016/S0140-6736(20)30211-7.
