J Med Virol. 2025 Jan;97(1):e70136.
doi: 10.1002/jmv.70136.

Genetic Conservation and Diversity of SARS-CoV-2 Envelope Gene Across Variants of Concern

Benjamin M Liu et al. J Med Virol. 2025 Jan.

Abstract

The SARS-CoV-2 envelope (E) protein is critical in viral assembly, release, and virulence. The E gene was considered highly conserved and slowly evolving. Pan-sarbecovirus-conserved regions in the E gene have been used as targets for various RT-PCR assays to detect SARS-CoV-2. It remains unclear whether SARS-CoV-2 variants of concern (VOCs) have accumulated significant E mutations that may affect protein stability and diagnostic RT-PCR assays. Herein we aimed to perform a comprehensive genetic analysis of the conservation and diversity of the E gene of SARS-CoV-2 and its VOCs in comparison with other human coronaviruses (HCoVs). In silico analysis of 20 326 HCoV E gene sequences retrieved from GenBank and GISAID suggests that the SARS-CoV-2 E gene has multiple pan-HCoVs- and pan-SARS-CoV-2-conserved positions but accumulates significant mutations in VOC B.1.351 and Omicron strains. Mutations were often found in the 5' and 3' variable regions, whereas the central region is conserved. Nucleotide changes C109U and A114G may lead to potential failure of first-line SARS-CoV-2 diagnostic/screening assays. Nucleotide change C212U and its concomitant amino acid substitution Pro71Leu (i.e., C212U/Pro71Leu) is a hallmark mutation of B.1.351 variants, while C26U/Thr9Ile is characteristic of all Omicron variants. Later Omicron subvariants, such as XBB.1.5 and EG.5, additionally acquired the A31G/Thr11Ala mutation, as was confirmed by whole genome sequencing of SARS-CoV-2 in 118 pediatric cases. Wild-type E protein is cytotoxic to cells, but the mutations Thr9Ile, Thr11Ala, Thr9Ile + Thr11Ala, and Pro71Leu each reduce its cytotoxicity. The Thr9Ile + Thr11Ala mutation stabilizes the E proteins of Omicron variants, while Pro71Leu alters the cellular distribution of the E protein, reducing its colocalization with the Golgi body. Altogether, this study not only sheds light on the conservation and diversity of the E gene in SARS-CoV-2 and its VOCs but also informs the improvement and development of SARS-CoV-2 or pan-HCoVs screening and diagnostic assays.
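The paired notation used in the abstract (for example C212U/Pro71Leu) links a nucleotide position in the E gene to the codon it falls in. As a minimal sketch, assuming 1-based numbering from the first base of the E gene start codon, the arithmetic below reproduces the residue numbers quoted above. The helper name `codon_of` is hypothetical and is not taken from the paper.

```python
def codon_of(nt_pos):
    """Map a 1-based nucleotide position in a coding sequence to
    (1-based codon number, 1-based position within that codon)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


# Nucleotide change -> residue number, as paired in the abstract.
pairs = {"C212U": 71, "C26U": 9, "A31G": 11}

for change, residue in pairs.items():
    nt_pos = int(change[1:-1])          # strip reference and variant bases
    codon, offset = codon_of(nt_pos)
    print(f"{change}: codon {codon}, base {offset} of the codon "
          f"(abstract reports residue {residue})")
```

The sketch only checks that the numbering is self-consistent. It does not translate codons, so it says nothing about which amino acid results from each change.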

Keywords: COVID‐19; SARS‐CoV‐2; conservation; envelope gene; genetic diversity.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1 |
Phylogenetic analysis of E gene sequences of different HCoVs. A circular phylogenetic tree was created using MEGA 4.0 software after alignment of 419 E gene sequences retrieved from whole genome sequences of SARS-CoV-2 (N = 137), SARS-CoV (N = 76), MERS-CoV (N = 40), HCoV-OC43 (N = 44), HCoV-HKU1 (N = 33), HCoV-229E (N = 30), and HCoV-NL63 (N = 59). E gene sequences of the same species were represented with the same color. Seven colors were used to represent different HCoVs, whose names, sequence numbers and GenBank accession numbers were indicated at the periphery of the circular phylogenetic tree. E gene of the same species clustered on the same branches that segregated from the other species. The segregation of E gene sequences on this phylogenetic tree was consistent with the species designation based on the whole genome sequences of different HCoVs. Red and green lines in the center of the circle indicate individual parts of the tree for Alpha-CoV and Beta-CoV, respectively, starting from the bifurcation point of both genera and pointing to further branches for different lineages and species. Lineage A (HCoV-HKU1 and HCoV-OC43), B (SARS-CoV-2 and SARS-CoV), and C (MERS-CoV) of Beta-CoV were indicated.
FIGURE 2 |
Comparison of E gene sequences of different HCoVs and identification of pan-HCoVs–conserved nucleotide positions. A total of 20 326 E gene sequences of SARS-CoV-2 and its VOCs (B.1.1.7 [UK variants], B.1.351 [South Africa variants], P.1 [Brazil variants], B.1.617.2 [Delta variants], and Omicron B.1.1.529, XBB.1.5, and EG.5), SARS-CoV, MERS-CoV, HCoV-OC43, HCoV-HKU1, HCoV-229E, and HCoV-NL63 were retrieved from full-length genomes of the corresponding viruses and aligned by ClustalW multiple alignment using BioEdit software. The reference sequence of the alignment was SARS-CoV-2 Wuhan-Hu-1 (GenBank accession no.: NC_045512). The nucleotide (nt) numbering of the alignment is shown at the top. The names of the different HCoVs, with their numbers of sequences in parentheses, are shown to the left of the sequence alignment. The length of the E gene sequences varies from species to species, with stop codons shown at the end of the sequences on the right side. Twenty-five pan-HCoVs–conserved nucleotide positions with 100% conserved nucleotide identity across all the selected sequences are highlighted in colored columns. Their position numbers (according to SARS-CoV-2 E gene numbering) and nucleotide composition are shown at the bottom. Gray areas in the E gene sequences of SARS-CoV, MERS-CoV, HCoV-OC43, HCoV-HKU1, HCoV-229E, and HCoV-NL63 indicate nucleotide positions that differ from those of the reference sequence SARS-CoV-2 Wuhan-Hu-1 NC_045512.
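The criterion in the Figure 2 caption (100% nucleotide identity across all aligned sequences) can be illustrated with a short sketch. It assumes the sequences are already aligned to equal length and that `-` marks a gap. The function name `conserved_columns` and the toy sequences are hypothetical and are not data from the study. Note that the function reports alignment columns, which need not equal SARS-CoV-2 E gene numbering once gaps are present.

```python
def conserved_columns(aligned):
    """Return the 1-based alignment columns where every sequence carries
    the same nucleotide. Columns containing a gap never count."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    hits = []
    for i in range(length):
        column = {seq[i].upper() for seq in aligned}
        if len(column) == 1 and "-" not in column:
            hits.append(i + 1)
    return hits


# Toy alignment (made up for illustration only).
toy = ["AUGUAC", "AUGU-C", "AUGCAC"]
print(conserved_columns(toy))
```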
FIGURE 3 |
Distribution of the identified pan-SARS-CoV-2–conserved nucleotide positions in E gene sequences. A total of 20 326 E gene sequences of SARS-CoV-2 and its VOCs (B.1.1.7 [U.K. variants], B.1.351 [South Africa variants], P.1 [Brazil variants], B.1.617.2 [Delta variants], and Omicron B.1.1.529, XBB.1.5, and EG.5), SARS-CoV, HCoV-OC43, HCoV-HKU1, MERS-CoV, HCoV-229E, and HCoV-NL63 were retrieved from full-length genomes of the corresponding viruses and aligned by ClustalW multiple alignment using BioEdit software (as shown in Figure 2). Pan-HCoVs–conserved nucleotide positions (highlighted in color in Figure 2, bottom) were shown in gray columns. Twelve pan-SARS-CoV-2–conserved nucleotide positions were shown at the bottom (according to SARS-CoV-2 E gene numbering). The nucleotide composition at these sites for different HCoVs was highlighted in columns with different colors.
FIGURE 4 |
Distribution of the identified important mutations of SARS-CoV-2 and its VOCs in E gene sequences, and their positions relative to the binding regions of commonly used E gene PCR primers/probes. A total of 19 933 E gene sequences of SARS-CoV-2 and its VOCs (B.1.1.7 [UK variants], B.1.351 [South Africa variants], P.1 [Brazil variants], B.1.617.2 [Delta variants], and Omicron B.1.1.529, XBB.1.5, and EG.5) were retrieved from full-length genomes of the corresponding viruses and aligned by ClustalW multiple alignment using BioEdit software. The reference sequence of the alignment was SARS-CoV-2 Wuhan-Hu-1 (GenBank accession no.: NC_045512). The nucleotide (nt) numbering of the alignment is shown at the top. The names of SARS-CoV-2 and its VOCs, with their numbers of sequences in parentheses, are shown to the left of the sequence alignment. Twelve SARS-CoV-2–specific nucleotide positions are highlighted in solid, colored columns (A: green; U: red; C: blue; G: black). Twenty-five pan-HCoVs–conserved nucleotide positions (highlighted in color in Figure 2) are shown in solid gray columns. Important nucleotide changes and their concomitant amino acid substitutions listed in Table 2 and Table 3 are indicated with hatched columns in different colors and varied lengths (reflecting mutation frequencies/rates). C12U, C26U, A31G, C61U, C109U, and A114G fell into the 5′ variable region, whereas G184U, G195U, C203U, C212U, and C225U fell into the 3′ variable region. The central conserved region between nt115 and nt180 did not have any significant mutations. To evaluate the impact of the identified significant mutations, binding regions of commonly used E gene PCR primers/probes are shown at the bottom of the alignment, including the Charité/Berlin (WHO) pan-Sarbecovirus E-gene primers (E_Sarbeco_F [nt25–nt50] and E_Sarbeco_R [nt137–nt116]) and probe (E_Sarbeco_P1 [nt88–nt113]) [18] and the IBS_E2 primers (F [nt15–nt36] and R [nt130–nt112]) [22].
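The coordinates in the Figure 4 caption are enough to check which listed nucleotide changes fall inside a primer or probe binding region. The sketch below assumes the quoted ranges are inclusive and stores the reverse primers in ascending order. The dictionary keys `IBS_E2_F` and `IBS_E2_R` are labels chosen here for illustration. An overlap only flags a candidate mismatch; whether an assay actually fails depends on factors the sketch does not model, such as the mismatch position within the oligonucleotide.

```python
# Binding regions from the caption (E gene numbering, assumed inclusive).
REGIONS = {
    "E_Sarbeco_F": (25, 50),
    "E_Sarbeco_R": (116, 137),
    "E_Sarbeco_P1": (88, 113),
    "IBS_E2_F": (15, 36),
    "IBS_E2_R": (112, 130),
}

# Nucleotide changes listed in the caption.
MUTATIONS = ["C12U", "C26U", "A31G", "C61U", "C109U", "A114G",
             "G184U", "G195U", "C203U", "C212U", "C225U"]


def position(change):
    """Extract the numeric position from a label such as 'C109U'."""
    return int(change[1:-1])


def overlaps(change):
    """List the primer/probe regions that contain the changed position."""
    pos = position(change)
    return [name for name, (lo, hi) in REGIONS.items() if lo <= pos <= hi]


for change in MUTATIONS:
    hits = overlaps(change)
    if hits:
        print(change, "->", ", ".join(hits))
```

Under these assumptions, C109U lands in the E_Sarbeco_P1 probe region and A114G in the IBS_E2 reverse primer region, which matches the abstract's concern about those two changes.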
FIGURE 5 |
Structural representation of key missense mutations in the E protein. ΔΔG is shown in parentheses. The mutant residues depicted in red destabilize the E protein (middle panels), while those shown in blue stabilize it (left and right panels). The orange part indicates the wild-type residues.
FIGURE 6 |
The effects of E gene mutations on its protein stability. Vector expressing HA-tagged WT or mutant (Mut) E gene (E-WT/Mut [Thr9Ile, Thr11Ala, Thr9Ile + Thr11Ala, or Pro71Leu]) was transfected into HEK293T cells for 10 h, followed by treatment without or with the indicated concentration of MG-132. At 24 h post transfection, cell lysates were collected and subjected to immunoblot with anti-HA and anti-tubulin (A) or total RNA was isolated and used for RT-PCR (B). Tubulin and GAPDH serve as loading controls for WB and RT-qPCR, respectively. N.S., not statistically significant.
FIGURE 7 |
Cytotoxic effects of WT E protein and its mutants of Thr9Ile, Pro71Leu, Thr11Ala, or Thr9Ile + Thr11Ala on HEK293 cells. Empty vectors or the vector encoding HA-tagged WT or mutant (Thr9Ile, Thr11Ala, Thr9Ile + Thr11Ala, or Pro71Leu) E gene were transfected into HEK293 cells for 48 h, followed by MTT assay. Double and triple asterisks denote that statistical differences exist with a p value of < 0.01 and < 0.001, respectively.
FIGURE 8 |
Subcellular distribution of WT and mutant E proteins. Vectors encoding the HA-tagged WT or mutant (Thr9Ile, Thr11Ala, Thr9Ile + Thr11Ala, or Pro71Leu) E gene were cotransfected into HEK293 cells with a Golgi-marker–expressing plasmid, followed by immunostaining and confocal microscopy. WT or mutated E proteins are shown in red (anti-HA antibody), and the Golgi is shown in green. DAPI and merge panels are also shown.

References

    1. Zhou P, Yang XL, Wang XG, et al., “A Pneumonia Outbreak Associated With a New Coronavirus of Probable Bat Origin,” Nature 579 (2020): 270–273, 10.1038/s41586-020-2012-7.
    2. Fauci AS, Lane HC, and Redfield RR, “Covid-19—Navigating the Uncharted,” New England Journal of Medicine 382 (2020): 1268–1269, 10.1056/NEJMe2002387.
    3. Liu BM and Hill HR, “Role of Host Immune and Inflammatory Responses in COVID-19 Cases With Underlying Primary Immunodeficiency: A Review,” Journal of Interferon & Cytokine Research 40, no. 12 (December 2020): 549–554, 10.1089/jir.2020.0210.
    4. Liu BM, Martins TB, Peterson LK, and Hill HR, “Clinical Significance of Measuring Serum Cytokine Levels As Inflammatory Biomarkers in Adult and Pediatric COVID-19 Cases: A Review,” Cytokine 142 (June 2021): 155478, 10.1016/j.cyto.2021.155478.
    5. Liu BM, Carlisle CP, Fisher MA, and Shakir SM, “The Brief Case: Capnocytophaga sputigena Bacteremia in a 94-Year-Old Male With Type 2 Diabetes Mellitus, Pancytopenia, and Bronchopneumonia,” Journal of Clinical Microbiology 59, no. 7 (June 2021): e0247220, 10.1128/JCM.02472-20.
