Nat Microbiol. 2021 Aug;6(8):1094-1101.
doi: 10.1038/s41564-021-00933-9. Epub 2021 Jun 23.

Emergence and spread of a SARS-CoV-2 lineage A variant (A.23.1) with altered spike protein in Uganda

Daniel Lule Bugembe et al. Nat Microbiol. 2021 Aug.

Abstract

Here, we report SARS-CoV-2 genomic surveillance from March 2020 until January 2021 in Uganda, a landlocked East African country with a population of approximately 40 million people. We report 322 full SARS-CoV-2 genomes from 39,424 reported SARS-CoV-2 infections, thus representing 0.8% of the reported cases. Phylogenetic analyses of these sequences revealed the emergence of lineage A.23.1 from lineage A.23. Lineage A.23.1 represented 88% of the genomes observed in December 2020, then 100% of the genomes observed in January 2021. The A.23.1 lineage was also reported in 26 other countries. Although the precise changes in A.23.1 differ from those reported in the first three SARS-CoV-2 variants of concern (VOCs), the A.23.1 spike-protein-coding region has changes similar to VOCs including a change at position 613, a change in the furin cleavage site that extends the basic amino acid motif and multiple changes in the immunogenic N-terminal domain. In addition, the A.23.1 lineage has changes in non-spike proteins including nsp6, ORF8 and ORF9 that are also altered in other VOCs. The clinical impact of the A.23.1 variant is not yet clear and it has not been designated as a VOC. However, our findings of emergence and spread of this variant indicate that careful monitoring of this variant, together with assessment of the consequences of the spike protein changes for COVID-19 vaccine performance, is advisable.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. SARS-CoV-2 lineage diversity in Uganda.
All high-coverage complete sequences from Uganda (n = 322) were lineage-typed using the pangolin resource (https://github.com/cov-lineages/pangolin). Lineage counts were stratified into four periods: March–May 2020 (a); June–August 2020 (b); September–November 2020 (c); and December 2020 to January 2021 (d). The percentage of each lineage within each set was plotted as a treemap using squarified treemap implemented in squarify (https://github.com/laserson/squarify) with the size of each sector proportional to the number of genomes; genome numbers are listed with ‘n = ’.
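The stratification described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`period_of`, `lineage_percentages`) and the example records are hypothetical and are not taken from the study, and exact calendar-month boundaries are assumed for the four periods.

```python
from collections import Counter, defaultdict
from datetime import date

# Period boundaries follow the caption; labels a-d match the figure panels.
# Whole calendar months are an assumption.
PERIODS = [
    ("a", date(2020, 3, 1), date(2020, 5, 31)),
    ("b", date(2020, 6, 1), date(2020, 8, 31)),
    ("c", date(2020, 9, 1), date(2020, 11, 30)),
    ("d", date(2020, 12, 1), date(2021, 1, 31)),
]


def period_of(collected):
    """Return the panel label for a collection date, or None if outside all periods."""
    for label, start, end in PERIODS:
        if start <= collected <= end:
            return label
    return None


def lineage_percentages(records):
    """Map each period to {lineage: (genome count, percentage within the period)}."""
    counts = defaultdict(Counter)
    for collected, lineage in records:
        label = period_of(collected)
        if label is not None:
            counts[label][lineage] += 1
    result = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        result[label] = {
            lineage: (n, 100.0 * n / total) for lineage, n in counter.items()
        }
    return result


# Hypothetical records (collection date, lineage); not data from the study.
example = [
    (date(2020, 12, 5), "A.23.1"),
    (date(2020, 12, 9), "A.23.1"),
    (date(2020, 12, 20), "A.23.1"),
    (date(2020, 12, 28), "B.1"),
    (date(2021, 1, 10), "A.23.1"),
]
summary = lineage_percentages(example)
```

The per-period genome counts are the quantities a treemap routine would use as sector sizes.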
Fig. 2. Maximum-likelihood phylogenetic tree comparing all available complete and high-coverage Uganda sequences (n = 322).
Strain names are coloured according to the case profile: cases from the community, dark red; prison, orange; truck driver, light brown; return traveller, light blue. The case clusters from prisons in Kitgum and Amuru are highlighted in colour boxes in light yellow and light green, respectively. Lineages A.23 and A.23.1 are indicated. The tree was rooted where lineages A and B were split. Branch length is drawn to the scale of the number of nucleotide substitutions per site, indicated in the lower left; only bootstrap values of major nodes are shown.
Fig. 3. Spike protein changes in lineage A.23 and A.23.1 relative to the SARS-CoV-2 reference strain (NC_045512) encoded protein.
a, The locations of important spike protein features are indicated. b, Each line represents the encoded spike protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Sequences from Amuru in August 2020, Kitgum in September 2020 and Uganda in October, November and December 2020/January 2021 are indicated. Coloured markers indicate the positions of amino acid substitutions from the reference strain sequence; only substitutions observed in multiple genomes are labelled (original amino acid, position, new amino acid), and the labels were placed as close as possible to the substitution. c, Current global temporal distribution of A.23 and A.23.1. All available SARS-CoV-2 genomes annotated as complete and lineage A from GISAID were retrieved on 4 February 2021, lineage-typed using pangolin and confirmed as A.23 and A.23.1 by extracting and examining the encoded spike protein. All new Uganda A.23 and A.23.1 genomes reported in this study were also included. Genomes were plotted by country and sample collection date.
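The labelling rule in panel b (original amino acid, position, new amino acid, shown only for substitutions seen in multiple genomes) can be illustrated with a short Python sketch. It is not the authors' code. It assumes each query protein is already aligned to the reference with no insertions, so that the alignment column equals the reference position; the function names and the toy sequences are hypothetical.

```python
from collections import Counter


def substitutions(reference, query):
    """List substitutions as '<original aa><position><new aa>' (1-based position).

    Assumes the two protein sequences are aligned to the same length and that
    the alignment contains no insertions relative to the reference.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (ref_aa, new_aa) in enumerate(zip(reference, query), start=1):
        # Skip gaps and ambiguous residues rather than calling them substitutions.
        if ref_aa in "-X" or new_aa in "-X":
            continue
        if ref_aa != new_aa:
            changes.append(f"{ref_aa}{position}{new_aa}")
    return changes


def recurrent_substitutions(reference, queries, min_genomes=2):
    """Keep only substitutions observed in at least min_genomes query sequences."""
    counts = Counter(
        change for query in queries for change in substitutions(reference, query)
    )
    return {change: n for change, n in counts.items() if n >= min_genomes}


# Toy sequences for illustration only; not real spike protein data.
toy_reference = "MFVFLVLLPL"
toy_queries = ["MFVFLVLLPL", "MFVLLVLLPL", "MFVLLVLLPQ", "MFVLLVLLPL"]
recurrent = recurrent_substitutions(toy_reference, toy_queries)
```

In the toy data the change at position 4 occurs in three sequences and is kept, while the change at position 10 occurs once and is dropped, mirroring the "observed in multiple genomes" filter.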
Fig. 4. Protein changes across lineage variants.
All forward open reading frames from the 35 early lineage B SARS-CoV-2 genomes were translated and processed into 44 amino acid peptides (with 22 amino acid overlap), clustered at 0.65 identity using uclust, aligned with MAFFT and converted into pHMMs using HMMER-3 (ref. ). The presence of each domain and its bit-score (a measure of the similarity between the query sequence and the sequences used for the pHMM) was sought in each set of SARS-CoV-2 VOC genomes, and 1 minus the mean of the normalized domain bit-scores was plotted across the genome (that is, 1 minus the similarity of the identified query domain to the reference lineage B SARS-CoV-2 domain). Domains are coloured by the proteins from which they were derived, with the colour code indicated below the figure. The genome positions (start,end) of the indicated open reading frames are the following: nsp1: 250,805; nsp2: 806,2719; nsp3: 2720,8554; nsp4: 8555,10054; nsp53Cpro: 10055,10972; nsp6TM: 10973,11842; nsp7: 11843,12091; nsp8Rep: 12092,12685; nsp9RNAbp: 12686,13024; nsp10CysHis: 13025,13441; RDRP: 13442,16236; nsp13hel: 16237,18039; nsp14ExoN: 18040,19620; nsp15endo: 19621,20658; nsp16OMT: 20659,21552; spike: 21563,25384; ORF3a: 25393,26220; ORF4E: 26245,26472; ORF5M: 26523,27195; ORF6: 27202,27387; ORF7a: 27394,27759; ORF7b: 27756,27887; ORF8: 27894,28259; ORF9N: 28274,29533; ORF10: 29558,29712. Note that ORF7b and ORF10 are too small to be detected by this analysis method. a, The query set comprises 49 lineage A.23.1 genomes, mostly from Uganda. b, All B.1.1.7 full genomes lacking ambiguous nucleotides deposited in GISAID on 26 January 2021 are shown. c, All B.1.351 full genomes lacking ambiguous nucleotides present in GISAID on 26 January 2021 are shown. d, All P.1 full genomes lacking ambiguous nucleotides present in GISAID on 26 January 2021 are shown.
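Two steps in this caption lend themselves to a small illustration: cutting a translated protein into 44 amino acid peptides that overlap by 22, and turning bit-scores into the plotted value of 1 minus the mean normalized bit-score. The Python sketch below is not the authors' pipeline. The handling of the final short peptide and the normalization (query bit-score divided by the bit-score of the reference domain against its own pHMM) are assumptions, since the caption does not define them, and the function names are hypothetical.

```python
from statistics import mean


def tile_peptides(protein, length=44, overlap=22):
    """Cut a protein into overlapping peptides; returns (1-based start, peptide).

    Assumption: a final, shorter peptide is kept only if it covers residues
    not already contained in the previous window.
    """
    if not protein:
        return []
    step = length - overlap
    stop = max(len(protein) - overlap, 1)
    return [
        (start + 1, protein[start:start + length])
        for start in range(0, stop, step)
    ]


def divergence(query_bit_scores, reference_bit_score):
    """Return 1 minus the mean normalized bit-score for one domain.

    Assumption: normalization divides each query bit-score by the bit-score
    obtained by the reference domain, so an unchanged domain scores 0.
    """
    if reference_bit_score <= 0:
        raise ValueError("reference bit-score must be positive")
    normalized = [score / reference_bit_score for score in query_bit_scores]
    return 1.0 - mean(normalized)


# Toy input for illustration only: a 100-residue placeholder protein.
peptides = tile_peptides("A" * 100)
starts = [start for start, _ in peptides]
unchanged = divergence([50.0, 50.0], 50.0)
changed = divergence([40.0, 50.0], 50.0)
```

With a 22-residue step, a 100-residue protein yields windows starting at residues 1, 23, 45 and 67; a domain whose query scores match the reference gives a divergence of 0, and lower query scores push the value toward 1.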
Extended Data Fig. 1. Maximum-likelihood phylogenetic tree comparing Uganda lineage A.23 and A.23.1 strains to global lineage A.23 and A.23.1 genomes.
A maximum-likelihood (ML) phylogenetic tree comparing Ugandan A.23 and A.23.1 genomes (n = 191) with the global A.23 and A.23.1 genomes (n = 336). The tree was rooted by the A.23 lineage and strains are coloured according to the countries where they were identified. Branch length is drawn to the scale of the number of nucleotide substitutions per site and only bootstrap values at the major nodes are shown. The tree was visualised in FigTree.
Extended Data Fig. 2. Changes in A.23/A.23.1 nsp6 protein.
The encoded nsp6 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the nsp6 protein from GenBank NC_045512.2. Panel a: The locations of important nsp6 protein features are indicated based on the analysis of nsp6 from Benvenuto et al. Intra_N: intravesicular amino-terminal region, Extra_loop_1: extravesicular loop 1, Intra_loop_1: intravesicular loop 1, B_del 106-108: the region of nsp6 deleted in the lineage B VOC genomes, Extra_loop_Big: large extravesicular loop, Intra_loop_2: intravesicular loop 2, Extra_loop_2: extravesicular loop 2, Intra_loop_3: intravesicular loop 3, Extra_C: carboxy-terminal extravesicular portion. All features with ‘membrane’ indicate membrane-spanning regions of nsp6. Panel b: Each line represents the encoded nsp6 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are labelled (original aa, position, new aa), and the labels were placed as close as possible to the substitution.
Extended Data Fig. 3. Changes in A.23/A.23.1 ORF8 protein.
The encoded ORF8 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the ORF8 protein from GenBank NC_045512.2. Panel a: The locations of important ORF8 protein features are indicated based on the analysis of ORF8 from Flower et al. Features with ‘Beta’ indicate beta-sheets, ORF8_specific is a region unique to SARS-CoV-2 ORF8, CLP_turn indicates a cysteine, leucine, proline motif essential for a fold in the mature protein, and Dimer interface2 indicates the region of the protein that forms the interface between two monomers. Panel b: Each line represents the encoded ORF8 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are labelled (original aa, position, new aa), and the labels were placed as close as possible to the substitution.
Extended Data Fig. 4. Changes in A.23/A.23.1 ORF9 protein.
The encoded ORF9 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the ORF9 protein from GenBank NC_045512.2. Panel a: The locations of important ORF9 protein features are indicated based on the analysis of ORF9 from Chang et al. N-term: amino-terminal extension, NTD: amino-terminal domain, linker: linker region between the NTD and CTD, CTD: carboxy-terminal domain, C-tail: carboxy-terminal extension. Regions with ‘Basic’ indicate the 4 regions enriched in positively charged amino acids. Panel b: Each line represents the encoded ORF9 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are labelled (original aa, position, new aa), and the labels were placed as close as possible to the substitution.
Extended Data Fig. 5
a, The percentage of total cases reported at the end of January 2021 was plotted by district (Perc_cases, dark blue bars). Only districts reporting 10 or more cases in the period were included. For the same districts, the percentage of total genomes obtained was plotted (Perc_genomes, light blue bars). The source data for Extended Data Fig. 5a can be found in Supplementary Table 2. b, District location of cases yielding full genome sequences. The district locations of cases from which full genome sequences were obtained are plotted on a map of Uganda. Districts with >10 genomes are marked in red, those with 2-10 genomes in blue and those with 1 genome in grey. Land masses are indicated in light grey, and lakes are indicated in pale blue.
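The two panels reduce to simple per-district arithmetic, sketched below in Python. This is an illustration only: the district names and counts are hypothetical, the function names are not from the study, and the sketch assumes that percentages are taken over the totals for all districts (the caption does not say whether excluded districts count toward the denominator).

```python
def district_percentages(cases, genomes, min_cases=10):
    """Return {district: (Perc_cases, Perc_genomes)} for districts with >= min_cases.

    Assumption: both denominators are totals over all districts, including
    those that are filtered out of the plot.
    """
    total_cases = sum(cases.values())
    total_genomes = sum(genomes.values())
    result = {}
    for district, n_cases in cases.items():
        if n_cases < min_cases:
            continue
        n_genomes = genomes.get(district, 0)
        result[district] = (
            100.0 * n_cases / total_cases,
            100.0 * n_genomes / total_genomes,
        )
    return result


def map_colour(genome_count):
    """Colour bin used in panel b: >10 red, 2-10 blue, 1 grey, otherwise None."""
    if genome_count > 10:
        return "red"
    if genome_count >= 2:
        return "blue"
    if genome_count == 1:
        return "grey"
    return None


# Hypothetical counts for three placeholder districts; not data from the study.
toy_cases = {"District A": 700, "District B": 295, "District C": 5}
toy_genomes = {"District A": 12, "District B": 7, "District C": 1}
percentages = district_percentages(toy_cases, toy_genomes)
colours = {district: map_colour(n) for district, n in toy_genomes.items()}
```

Comparing the two percentages per district gives a quick view of where sequencing was over- or under-represented relative to reported cases.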
