Sci Rep. 2020 Oct 26;10(1):18289.
doi: 10.1038/s41598-020-74050-8.

The global population of SARS-CoV-2 is composed of six major subtypes

Ivair José Morais et al. Sci Rep. 2020.

Abstract

The World Health Organization characterized COVID-19 as a pandemic in March 2020, the second pandemic of the twenty-first century. Expanding virus populations, such as that of SARS-CoV-2, accumulate a number of narrowly shared polymorphisms, imposing a confounding effect on traditional clustering methods. In this context, approaches that reduce the complexity of the sequence space occupied by the SARS-CoV-2 population are necessary for robust clustering. Here, we propose subdividing the global SARS-CoV-2 population into six well-defined subtypes and 10 poorly represented genotypes named tentative subtypes by focusing on the widely shared polymorphisms in nonstructural (nsp3, nsp4, nsp6, nsp12, nsp13 and nsp14) cistrons and structural (spike and nucleocapsid) and accessory (ORF8) genes. The six subtypes and the additional genotypes showed amino acid replacements that might have phenotypic implications. Notably, three mutations (one of them in the Spike protein) were responsible for the geographical segregation of subtypes. We hypothesize that the virus subtypes detected in this study are records of the early stages of SARS-CoV-2 diversification that were randomly sampled to compose the virus populations around the world. The genetic structure determined for the SARS-CoV-2 population provides substantial guidelines for maximizing the effectiveness of trials for testing candidate vaccines or drugs.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Mean pairwise number of nucleotide differences per site (nucleotide diversity, π) calculated using a sliding window of 300 nucleotides across the multiple sequence alignment for full-length genomes of SARS-CoV-2. The red dashed line at π = 0.001 represents an arbitrary threshold used to subdivide the segments (S) with higher (S2, 4, 6, 8, 10, 12, 14 and 16) and lower (S1, 3, 5, 7, 9, 11, 13, 15 and 17) levels of genetic variation. The SARS-CoV-2 genome organization is represented on top of the plot.
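The sliding-window calculation described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the step size of one nucleotide, and the choice to skip gaps and ambiguous bases are all assumptions made for illustration. Only the window length of 300 and the threshold of 0.001 come from the caption.

```python
from collections import Counter


def column_diversity(column):
    """Mean pairwise difference at one alignment column.

    Assumption: gaps and ambiguous bases are skipped, so only A, C, G, T count.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    n = sum(counts.values())
    if n < 2:
        return 0.0
    pairs = n * (n - 1) / 2
    identical_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return (pairs - identical_pairs) / pairs


def sliding_pi(alignment, window=300, step=1):
    """Return (window_start, pi) tuples across an aligned set of sequences.

    `step` is a hypothetical parameter; the caption does not state one.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    per_site = [
        column_diversity("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]
    return [
        (start, sum(per_site[start:start + window]) / window)
        for start in range(0, length - window + 1, step)
    ]


def high_diversity_windows(pi_values, threshold=0.001):
    """Keep windows whose pi exceeds the threshold used in the caption."""
    return [(start, pi) for start, pi in pi_values if pi > threshold]
```

Calling `sliding_pi` on a list of equal-length aligned strings and passing the result to `high_diversity_windows` gives the windows that would lie above the dashed line in such a plot.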
Figure 2
Multidimensional scaling (MDS) visualization of tree distances based on the Kendall-Colijn metric (λ = 0). The seventeen ML trees (each with 593 tips) are represented as dots, and groups of trees showing similar topologies are indicated by the same colour. The WSP-containing segment-based trees formed six groups: the first group comprised S2, S8 and S12 (indicated in blue), while the other five were represented by single trees (groups 2–6 indicated in red, green, orange, purple and brown, respectively). All nWSP-containing segment-based ML trees formed a single group, indicated in pink.
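For readers unfamiliar with the Kendall-Colijn metric, the sketch below shows how the λ = 0 case compares two rooted topologies: for every pair of tips it records the number of edges between the root and their most recent common ancestor, then takes the Euclidean distance between the two resulting vectors. The nested-tuple tree representation and the function names are assumptions chosen for illustration, and the code does not reproduce the authors' workflow or the MDS step.

```python
from itertools import combinations
from math import sqrt


def mrca_depths(tree):
    """Map each tip pair to the number of edges from the root to its MRCA.

    Assumption: a tree is a nested tuple, with strings as tip labels,
    e.g. (("A", "B"), ("C", "D")).
    """
    depths = {}

    def visit(node, depth):
        if isinstance(node, str):
            return [node]
        child_tips = [visit(child, depth + 1) for child in node]
        # Tips in different child subtrees have this node as their MRCA.
        for left, right in combinations(child_tips, 2):
            for a in left:
                for b in right:
                    depths[frozenset((a, b))] = depth
        return [tip for tips in child_tips for tip in tips]

    visit(tree, 0)
    return depths


def kc_distance(tree_a, tree_b):
    """Kendall-Colijn distance with lambda = 0 (topology only).

    The per-tip pendant-edge entries equal 1 in both vectors and cancel,
    so only the tip-pair entries are compared.
    """
    depths_a, depths_b = mrca_depths(tree_a), mrca_depths(tree_b)
    if depths_a.keys() != depths_b.keys():
        raise ValueError("trees must share the same tip labels")
    return sqrt(sum((depths_a[p] - depths_b[p]) ** 2 for p in depths_a))
```

A matrix of such pairwise distances between trees is the kind of input that an MDS projection like the one in this figure would take.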
Figure 3
Geographical distribution of six subtypes of SARS-CoV-2 around the world. The genomic data set comprised isolates sampled from 40 distinct countries from December 24, 2019 to March 20, 2020. The pie charts show the proportion of each subtype of SARS-CoV-2 according to the colour key at the bottom of the figure. For more detailed information on virus spread, a dynamic map is available at https://microreact.org/project/f25A3jAvE5TjzxAf38UCEq (accessible via the QR code in the bottom left corner of the map).
Figure 4
Maximum likelihood phylogenetic tree based on 12 WSPs detected across the SARS-CoV-2 genomes. The background colour of the tips indicates the subtype (I–VI) or tentative subtype (VII–XVI). An outer strip indicates the geographic origin (Western or Eastern Hemisphere) and whether each isolate was subjected to intermediate cell culture passages before genome sequencing.
