Commun Biol. 2021 Nov 30;4(1):1343.
doi: 10.1038/s42003-021-02858-9.

Template switching and duplications in SARS-CoV-2 genomes give rise to insertion variants that merit monitoring

Sofya K Garushyants et al. Commun Biol.

Abstract

The appearance of multiple new SARS-CoV-2 variants during the COVID-19 pandemic is a matter of grave concern. Some of these variants, such as B.1.617.2, B.1.1.7, and B.1.351, manifest higher infectivity and virulence than the earlier SARS-CoV-2 variants, with potential dramatic effects on the course of the pandemic. So far, analysis of new SARS-CoV-2 variants focused primarily on nucleotide substitutions and short deletions that are readily identifiable by comparison to consensus genome sequences. In contrast, insertions have largely escaped the attention of researchers although the furin site insert in the Spike (S) protein is thought to be a determinant of SARS-CoV-2 virulence. Here, we identify 346 unique inserts of different lengths in SARS-CoV-2 genomes and present evidence that these inserts reflect actual virus variance rather than sequencing artifacts. Two principal mechanisms appear to account for the inserts in the SARS-CoV-2 genomes: polymerase slippage and template switch that might be associated with the synthesis of subgenomic RNAs. At least three inserts in the N-terminal domain of the S protein are predicted to lead to escape from neutralizing antibodies, whereas other inserts might result in escape from T-cell immunity. Thus, inserts in the S protein can affect its antigenic properties and merit monitoring.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Insertions in SARS-CoV-2 genomes.
a Distribution of insert lengths. b Nucleotide composition of inserts of different lengths and full SARS-CoV-2 genome. c Distribution of inserts along the genome. Each triangle represents one insertion event. The level of confidence in each variant is represented by color: dark green, confirmed by sequencing read analysis; green, monophyletic in the tree, no read data available; light green, observed multiple times, but not monophyletic; gray, singletons (Supplementary Data 2). The positions of inserts are marked with gray dashed lines. d Experimental data on SARS-CoV-2 transcriptome showing template switch hotspots during the formation of sgRNAs, showing the distribution of junction reads connecting recombination hotspots along the genome. e Distance from inserts to closest template switching hotspot site (green) compared with random expectation (gray). Wilcoxon rank sum test p value is provided. f The number of inserts that occur in structured regions of SARS-CoV-2 genomic RNA (blue) compared with random expectation (gray). Permutation test p value is provided. The data on SARS-CoV-2 structure was obtained from. The code to reproduce the figure is provided in repository (see Code availability).
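
The comparison described for panel e can be sketched as follows. This is a minimal illustration, not the authors' code (which is in their repository): the hotspot and insert coordinates are hypothetical, the genome length assumes the 29,903-nt Wuhan-Hu-1 reference, and the helper names `nearest_distance` and `random_expectation` are invented for this example.

```python
import bisect
import random

# Assumption: coordinates refer to a 29,903-nt reference genome.
GENOME_LENGTH = 29903


def nearest_distance(position, sorted_sites):
    """Distance from `position` to the closest site in an ascending list."""
    i = bisect.bisect_left(sorted_sites, position)
    candidates = []
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - position)   # nearest site to the right
    if i > 0:
        candidates.append(position - sorted_sites[i - 1])  # nearest site to the left
    return min(candidates)


def random_expectation(n_positions, sorted_sites, genome_length=GENOME_LENGTH, seed=0):
    """Distances to the nearest site for positions drawn uniformly from the genome."""
    rng = random.Random(seed)
    return [
        nearest_distance(rng.randint(1, genome_length), sorted_sites)
        for _ in range(n_positions)
    ]


# Hypothetical coordinates, for illustration only.
hotspots = sorted([70, 21560, 26240, 28260])
inserts = [21990, 22200, 28250]

observed = [nearest_distance(p, hotspots) for p in inserts]
expected = random_expectation(1000, hotspots)
```

The two lists, `observed` and `expected`, could then be compared with a rank sum test, for example `scipy.stats.ranksums`, to ask whether inserts lie closer to hotspots than randomly placed positions do.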
Fig. 2. Long insertions possibly occur through template switching and formation of nc sgRNAs.
a Each triangle shows an independent insertion event, colored as in Fig. 1. Curves on the upper side of the plot connect the insertion origin site and the insertion position; brown indicates that the origin sequence is on the same strand, and gray that it is on the complementary strand. Curves at the bottom of the plot represent the experimental data on sgRNAs from Kim et al. Curves highlighted in violet correspond to the three cases in which the insert and its corresponding origin site co-occur with sgRNA junctions. The SARS-CoV-2 genes are colored as in Fig. 1. b, c Permutation tests showing the number of template switches co-occurring with RdRp jumps (x-axis) expected at random (blue): b when only the positions of the origins were randomly sampled 10,000 times from the genome; c when both ends were randomly sampled. The red vertical line represents the number observed in the data. The code to reproduce the permutation tests is available in the repository (see Code availability).
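
The permutation scheme in panels b and c can be sketched roughly as below. This is a hedged illustration, not the authors' implementation (see their repository for that): the junction and insert coordinates are hypothetical, the matching tolerance of 50 nt and the +1 correction in the p value are assumptions of this sketch, and all function names are invented here.

```python
import random

# Assumption: coordinates refer to a 29,903-nt reference genome.
GENOME_LENGTH = 29903


def co_occurs(origin, target, junctions, tolerance):
    """True if both ends of an (origin, target) pair fall near the two ends of one junction."""
    for a, b in junctions:
        forward = abs(origin - a) <= tolerance and abs(target - b) <= tolerance
        reverse = abs(origin - b) <= tolerance and abs(target - a) <= tolerance
        if forward or reverse:
            return True
    return False


def count_co_occurring(pairs, junctions, tolerance):
    return sum(co_occurs(o, t, junctions, tolerance) for o, t in pairs)


def permutation_test(pairs, junctions, tolerance=50, n_permutations=10_000,
                     resample_targets=False, seed=0):
    """Resample origins only (as in panel b) or both ends (as in panel c)."""
    rng = random.Random(seed)
    observed = count_co_occurring(pairs, junctions, tolerance)
    null_counts = []
    for _ in range(n_permutations):
        resampled = []
        for _origin, target in pairs:
            new_origin = rng.randint(1, GENOME_LENGTH)
            new_target = rng.randint(1, GENOME_LENGTH) if resample_targets else target
            resampled.append((new_origin, new_target))
        null_counts.append(count_co_occurring(resampled, junctions, tolerance))
    # One-sided p value with a +1 correction so it is never exactly zero.
    at_least = sum(c >= observed for c in null_counts)
    p_value = (at_least + 1) / (n_permutations + 1)
    return observed, null_counts, p_value


# Hypothetical sgRNA junctions and (origin, insertion position) pairs.
junctions = [(65, 21552), (69, 26237), (66, 28256)]
pairs = [(70, 21560), (15000, 22000), (60, 28250)]

observed, null_origins, p_origins = permutation_test(pairs, junctions)
_, null_both, p_both = permutation_test(pairs, junctions, resample_targets=True)
```

Here `observed` plays the role of the red vertical line, and the histograms of `null_origins` and `null_both` correspond to the blue distributions in panels b and c respectively.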
Fig. 3. Location of insertion sites in SARS-CoV-2 S protein.
a Surface representation showing that all observed insertions can potentially change surface properties (PDB ID: 7cn8). b Ins 245, 246 and 248 are located on the surface interacting with 4A8 antibody (PDB ID: 7cl2). Enlarged is the interacting surface. Cyan, N-terminal domain (NTD), wheat, receptor-binding domain (RBD), dark red, receptor-binding motif (RBM), aquamarine, heavy chain of the 4A8 antibody (PDB ID: 7cl2). Each insertion is shown in a distinct color. The models for each insertion were generated with the SWISS-model web server. c Location of insertions in the genome of SARS-CoV-2. Full description of insertions is provided in the Supplementary Data 4. Triangle size is proportional to the insert length.
