Nat Microbiol. 2021 Aug;6(8):1094-1101.

doi: 10.1038/s41564-021-00933-9. Epub 2021 Jun 23.

Emergence and spread of a SARS-CoV-2 lineage A variant (A.23.1) with altered spike protein in Uganda

Affiliations

¹ Medical Research Council/Uganda Virus Research Institute, London School of Hygiene & Tropical Medicine Uganda Research Unit, Entebbe, Uganda.
² Central Public Health Laboratories of the Republic of Uganda, Kampala, Uganda.
³ Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK.
⁴ Uganda Virus Research Institute, Entebbe, Uganda.
⁵ Medical Research Council/Uganda Virus Research Institute, London School of Hygiene & Tropical Medicine Uganda Research Unit, Entebbe, Uganda. Matthew.Cotten@lshtm.ac.uk.
⁶ Medical Research Council-University of Glasgow Centre for Virus Research, Glasgow, UK. Matthew.Cotten@lshtm.ac.uk.

^# Contributed equally.

PMID: 34163035
PMCID: PMC8318884
DOI: 10.1038/s41564-021-00933-9

Daniel Lule Bugembe et al. Nat Microbiol. 2021 Aug.

Abstract

Here, we report SARS-CoV-2 genomic surveillance from March 2020 until January 2021 in Uganda, a landlocked East African country with a population of approximately 40 million people. We report 322 full SARS-CoV-2 genomes from 39,424 reported SARS-CoV-2 infections, thus representing 0.8% of the reported cases. Phylogenetic analyses of these sequences revealed the emergence of lineage A.23.1 from lineage A.23. Lineage A.23.1 represented 88% of the genomes observed in December 2020, then 100% of the genomes observed in January 2021. The A.23.1 lineage was also reported in 26 other countries. Although the precise changes in A.23.1 differ from those reported in the first three SARS-CoV-2 variants of concern (VOCs), the A.23.1 spike-protein-coding region has changes similar to VOCs including a change at position 613, a change in the furin cleavage site that extends the basic amino acid motif and multiple changes in the immunogenic N-terminal domain. In addition, the A.23.1 lineage has changes in non-spike proteins including nsp6, ORF8 and ORF9 that are also altered in other VOCs. The clinical impact of the A.23.1 variant is not yet clear and it has not been designated as a VOC. However, our findings of emergence and spread of this variant indicate that careful monitoring of this variant, together with assessment of the consequences of the spike protein changes for COVID-19 vaccine performance, are advisable.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. SARS-CoV-2 lineage diversity in Uganda.**
All high-coverage complete sequences from Uganda (n = 322) were lineage-typed using the pangolin resource (https://github.com/cov-lineages/pangolin). Lineage counts were stratified into four periods: March–May 2020 (a); June–August 2020 (b); September–November 2020 (c); and December 2020 to January 2021 (d). The percentage of each lineage within each period was plotted as a squarified treemap using squarify (https://github.com/laserson/squarify), with the size of each sector proportional to the number of genomes; genome numbers are listed with ‘n = ’.
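
The stratification step described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `period_of` and `lineage_percentages`, the exact day boundaries of each period, and the toy records are all assumptions made for the example. The resulting percentages are the kind of values one could then pass to a treemap plotting routine.

```python
from collections import Counter
from datetime import date

# Period labels follow the caption (a-d); the exact first and last days
# of each period are an assumption made for this sketch.
PERIODS = [
    ("a", date(2020, 3, 1), date(2020, 5, 31)),
    ("b", date(2020, 6, 1), date(2020, 8, 31)),
    ("c", date(2020, 9, 1), date(2020, 11, 30)),
    ("d", date(2020, 12, 1), date(2021, 1, 31)),
]


def period_of(collected):
    """Return the period label for a collection date, or None if outside all periods."""
    for label, start, end in PERIODS:
        if start <= collected <= end:
            return label
    return None


def lineage_percentages(records):
    """Map each period label to {lineage: percentage of genomes in that period}.

    records: iterable of (collection_date, lineage) pairs.
    """
    counts = {label: Counter() for label, _, _ in PERIODS}
    for collected, lineage in records:
        label = period_of(collected)
        if label is not None:
            counts[label][lineage] += 1
    result = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        result[label] = (
            {lineage: 100.0 * n / total for lineage, n in counter.items()}
            if total
            else {}
        )
    return result


# Hypothetical toy records, not data from the study.
toy_records = [
    (date(2020, 4, 2), "B.1"),
    (date(2020, 4, 9), "A"),
    (date(2020, 12, 15), "A.23.1"),
    (date(2021, 1, 5), "A.23.1"),
]
toy_percentages = lineage_percentages(toy_records)
```

Keeping the counting separate from the plotting means the same percentages can be checked in a table before any treemap is drawn.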

**Fig. 2. Maximum-likelihood phylogenetic tree comparing all available complete and high-coverage Uganda sequences (n = 322).**
Strain names are coloured according to the case profile: cases from the community, dark red; prison, orange; truck driver, light brown; return traveller, light blue. The case clusters from prisons in Kitgum and Amuru are highlighted with light yellow and light green boxes, respectively. Lineages A.23 and A.23.1 are indicated. The tree was rooted at the split between lineages A and B. Branch lengths are drawn to the scale of the number of nucleotide substitutions per site, indicated in the lower left; only bootstrap values of major nodes are shown.

**Fig. 3. Spike protein changes in lineage A.23 and A.23.1 relative to the SARS-CoV-2 reference strain (NC_045512) encoded protein.**
a, The locations of important spike protein features are indicated. b, Each line represents the encoded spike protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Sequences from Amuru in August 2020, Kitgum in September 2020 and Uganda in October, November and December 2020/January 2021 are indicated. Coloured markers indicate the positions of amino acid substitutions from the reference strain sequence; only substitutions observed in multiple genomes are annotated with the annotation (original amino acid position, new amino acid) and the labels were placed as close as possible to the substitution. c, Current global temporal distribution of A.23 and A.23.1. All available SARS-CoV-2 genomes annotated as complete and lineage A from GISAID were retrieved on 4 February 2021 and lineage-typed using pangolin and confirmed as A.23 and A.23.1 by extracting and examining the encoded spike protein. All new Uganda A.23 and A.23.1 reported in this study were also included. Genomes were plotted by country and sample collection date.

**Fig. 4. Protein changes across lineage variants.**
All forward open reading frames from the 35 early lineage B SARS-CoV-2 genomes were translated and processed into 44 amino acid peptides (with 22 amino acid overlap), clustered at 0.65 identity using uclust, aligned with MAFFT and converted into profile hidden Markov models (pHMMs) using HMMER-3 (ref. ). The presence of each domain and its bit-score (a measure of the similarity between the query sequence and the sequences used for the pHMM) was sought in each set of SARS-CoV-2 VOC genomes, and 1 minus the mean of the normalized domain bit-scores was plotted across the genome (that is, 1 minus the similarity of the identified query domain to the reference lineage B SARS-CoV-2 domain). Domains are coloured by the proteins from which they were derived, with the colour code indicated below the figure. The genome positions (start,end) of the indicated open reading frames are the following: nsp1: 250,805; nsp2: 806,2719; nsp3: 2720,8554; nsp4: 8555,10054; nsp53Cpro: 10055,10972; nsp6TM: 10973,11842; nsp7: 11843,12091; nsp8Rep: 12092,12685; nsp9RNAbp: 12686,13024; nsp10CysHis: 13025,13441; RDRP: 13442,16236; nsp13hel: 16237,18039; nsp14ExoN: 18040,19620; nsp15endo: 19621,20658; nsp16OMT: 20659,21552; spike: 21563,25384; ORF3a: 25393,26220; ORF4E: 26245,26472; ORF5M: 26523,27195; ORF6: 27202,27387; ORF7a: 27394,27759; ORF7b: 27756,27887; ORF8: 27894,28259; ORF9N: 28274,29533; ORF10: 29558,29712. Note that ORF7b and ORF10 are too small to be detected by this analysis method. a, The query set comprises 49 lineage A.23.1 genomes, mostly from Uganda. b, All B.1.1.7 full genomes lacking ambiguous nucleotides deposited in GISAID on 26 January 2021 are shown. c, All B.1.351 full genomes lacking ambiguous nucleotides present in GISAID on 26 January 2021 are shown. d, All P.1 full genomes lacking ambiguous nucleotides present in GISAID on 26 January 2021 are shown.
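
Two steps of this caption lend themselves to a short Python sketch: cutting a translated protein into 44 amino acid peptides that overlap by 22, and turning domain bit-scores into the plotted "1 minus mean" value. The sketch is illustrative only. The function names are hypothetical, the handling of trailing residues is an assumption, and so is the normalization (each bit-score divided by the score of the reference domain against its own model, capped at 1), since the caption does not say how scores were normalized. The clustering, alignment and pHMM-building steps (uclust, MAFFT, HMMER-3) are not reproduced here.

```python
def overlapping_peptides(protein, size=44, overlap=22):
    """Split a protein sequence into fixed-size peptides with a fixed overlap."""
    step = size - overlap
    if len(protein) <= size:
        return [protein]
    peptides = [
        protein[start:start + size]
        for start in range(0, len(protein) - size + 1, step)
    ]
    # Assumption: if the windows do not end exactly at the C-terminus,
    # add one final full-size window anchored at the end of the protein.
    if (len(protein) - size) % step != 0:
        peptides.append(protein[-size:])
    return peptides


def divergence(query_bit_scores, reference_bit_score):
    """Return 1 minus the mean normalized bit-score for one domain.

    Assumption: each score is normalized by the reference domain's own
    bit-score and capped at 1.0, so 0.0 means "identical to reference".
    """
    normalized = [
        min(score / reference_bit_score, 1.0) for score in query_bit_scores
    ]
    return 1.0 - sum(normalized) / len(normalized)


# Hypothetical inputs for illustration only.
toy_protein = "M" * 100
toy_peptides = overlapping_peptides(toy_protein)
toy_divergence = divergence([50.0, 50.0], 100.0)
```

With these defaults each residue away from the termini is covered by two peptides, which is why a single substitution can lower the score of two adjacent domains.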

**Extended Data Fig. 1. Maximum-likelihood phylogenetic tree comparing Uganda lineage A.23 and A.23.1 strains to global lineage A.23 and A.23.1 genomes.**
A maximum-likelihood (ML) phylogenetic tree comparing Ugandan A.23 and A.23.1 (n = 191) with the global A.23 and A.23.1 (n = 336). The tree was rooted by the A.23 lineage and strains were coloured according to the countries where they were identified. Branch lengths were drawn to the scale of the number of nucleotide substitutions per site and only bootstrap values at the major nodes were shown. The tree was visualised in FigTree.

**Extended Data Fig. 2. Changes in A.23/A.23.1 nsp6 protein.**
The encoded nsp6 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the nsp6 protein from GenBank NC_045512.2. Panel a: The locations of important nsp6 protein features are indicated based on the analysis of nsp6 from Benvenuto et al. Intra_N: intravesicular amino-terminal region, Extra_loop_1: extravesicular loop 1, Intra_loop_1: intravesicular loop 1, B_del 106-108: the region of nsp6 deleted in the lineage B VOC genomes, Extra_loop_Big: large extravesicular loop, Intra_loop_2: intravesicular loop 2, Extra_loop_2: extravesicular loop 2, Intra_loop_3: intravesicular loop 3, Extra_C: carboxy-terminal extravesicular portion. All features with ‘membrane’ indicate membrane-spanning regions of nsp6. Panel b: Each line represents the encoded nsp6 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are annotated with the annotation (original aa position new aa) and the labels were placed as close as possible to the substitution.

**Extended Data Fig. 3. Changes in A.23/A.23.1 ORF8 protein.**
The encoded ORF8 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the ORF8 protein from GenBank NC_045512.2. Panel a: The locations of important ORF8 protein features are indicated based on the analysis of ORF8 from Flower et al. Features with ‘Beta’ indicate beta-sheets, ORF8_specific is a region unique to SARS-CoV-2 ORF8, CLP_turn indicates a cysteine, leucine, proline motif essential for a fold in the mature protein, Dimer interface2 indicates the region of the protein that forms the interface between two monomers. Panel b: Each line represents the encoded ORF8 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are annotated with the annotation (original aa position new aa) and the labels were placed as close as possible to the substitution.

**Extended Data Fig. 4. Changes in A.23/A.23.1 ORF9 protein.**
The encoded ORF9 proteins from all Ugandan A.23 and A.23.1 genomes were gathered, aligned and compared to the ORF9 protein from GenBank NC_045512.2. Panel a: The locations of important ORF9 protein features are indicated based on the analysis of ORF9 from Chang et al. N-term: amino-terminal extension, NTD: amino-terminal domain, linker: linker region between the NTD and CTD, CTD: carboxy-terminal domain, C-tail: carboxy-terminal extension. Regions with ‘Basic’ indicate the 4 regions enriched in positively charged amino acids. Panel b: Each line represents the encoded ORF9 protein sequence from a single genome, ordered by date of sample collection (bottom earliest, top most recent). Coloured markers indicate the positions of amino acid (aa) substitutions from the reference strain sequence; only substitutions observed in multiple genomes are annotated with the annotation (original aa position new aa) and the labels were placed as close as possible to the substitution.

**Extended Data Fig. 5**
a, The percentage of total cases reported at the end of January 2021 was plotted by district (Perc_cases, dark blue bars). Only districts reporting 10 or more cases in the period were included. For the same districts, the percentage of total genomes obtained was plotted (Perc_genomes, light blue bars). The source data for Extended Data Fig. 5a can be found in Supplementary Table 2. b, District location of cases yielding full genome sequences. The district locations of cases from which full genome sequences were obtained are plotted on a map of Uganda. Districts with >10 genomes were marked in red, 2-10 genomes marked in blue and 1 genome marked in grey. Land masses are indicated in light grey, and lakes are indicated in pale blue.
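
The two panels of this figure rest on simple per-district bookkeeping, sketched below in Python. The helper names `percentages` and `district_colour`, the district labels and the counts are all hypothetical; only the colour thresholds (>10 red, 2-10 blue, 1 grey) come from the caption. Leaving districts with no genomes unmarked is an assumption.

```python
def percentages(counts):
    """Convert {district: count} into {district: percentage of the total}."""
    total = sum(counts.values())
    return {district: 100.0 * n / total for district, n in counts.items()}


def district_colour(genome_count):
    """Map a district's genome count to the map colour named in the caption."""
    if genome_count > 10:
        return "red"
    if genome_count >= 2:
        return "blue"
    if genome_count == 1:
        return "grey"
    return None  # assumption: districts without genomes are left unmarked


# Hypothetical counts for illustration only, not data from the study.
toy_genomes = {"District X": 1, "District Y": 3}
toy_perc_genomes = percentages(toy_genomes)
toy_colours = {d: district_colour(n) for d, n in toy_genomes.items()}
```

Applying `percentages` to case counts and to genome counts for the same districts gives the paired Perc_cases and Perc_genomes values compared in panel a.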
