J Mol Evol. 2025 Aug;93(4):478-493.
doi: 10.1007/s00239-025-10256-6. Epub 2025 Jun 21.

Possible Acquisition and Molecular Evolution of vpu Genes Inferred from Comprehensive Sequence Analysis of Human and Simian Immunodeficiency Viruses

Miu Naruki et al. J Mol Evol. 2025 Aug.

Abstract

Vpu, an accessory protein of human immunodeficiency virus-1 (HIV-1), plays a crucial role in viral particle production and significantly contributes to HIV virulence. However, the evolution of the vpu gene remains poorly understood. We conducted a computational analysis of approximately 39,000 simian immunodeficiency virus (SIV) and HIV sequences, focusing on 141 representative Vpu proteins. Phylogenetic analysis classified the SIV and HIV strains into four major types based on their Vpu proteins: Vpu-type 1 (ancestral, found in SIVs such as SIVmon and SIVgsn), Vpu-type 2 (SIVgor and HIV-1 group O), Vpu-type 3 (SIVcpz), and Vpu-type 4 (HIV-1 group M and N). Notably, Vpu-type 1 exhibited variability in gene length, genome length, and the overlap between vpu and env compared with other Vpu-types. A phylogenetic tree was constructed using 426 nucleotide sequences from HIV-1, HIV-2, and SIVs focusing on the region between the pol and env genes. Vpu-type 1 was closely clustered with SIVasc and SIVsyk, lacking both vpu and vpx. The similarities observed between vpu and genes such as vpr and env suggest that vpu originated within the SIV genome. In addition, a phylogenetic tree constructed from 252 Vpu-type 4a sequences from the HIV pandemic strain and 135 sequences of circulating recombinant forms of HIV-1 revealed 18 distinct protein subtypes, exceeding the number of previously recognized subtypes. The systematic analysis of the sequences from large datasets has enabled a detailed characterization of the transition states of vpu, enhancing our understanding of the processes driving viral diversity.

Keywords: Bioinformatics; HIV-1; Molecular evolution; Phylogenetic tree; SIV; Vpu.

Conflict of interest statement

Declarations. Conflict of Interest: The authors declare that they have no conflicts of interest.

Figures

Fig. 1
Midpoint-rooted phylogenetic tree of Vpu proteins of representative HIV-1 and SIV strains. The tree was constructed using 141 Vpu protein sequences (see Supplementary table S1) with 1,000 bootstrap replicates. The annotations for each sequence are presented in the following order: viral strain, viral group, and GenBank protein ID. These annotations are color-coded according to the viral host; human-derived sequences are shown in red and sequences from apes and monkeys are shown in blue. Additionally, the DSGxxS motif corresponding to each sequence is displayed, and conserved amino acid residues shared between types are highlighted in yellow. HIV-1 group M sequences were collapsed due to the vast number of sequences and are shown as a black triangle on the tree. Five types of Vpu proteins (Vpu-types 1, 2, 3, 4a, and 4b) were classified based on the clustering of the evolutionary lineages. The scale bar below the tree indicates 0.3 (30%) amino acid substitutions per site, and the bootstrap values (% of 1,000 replicates) are shown at each node.
Fig. 2
Multiple amino acid sequence alignment of the Vpu proteins. A Protein structure of Vpu. The transmembrane domain and cytoplasmic domain regions in the Vpu protein are marked at the top. Conserved regions are displayed in the consensus sequence, with non-conserved residues marked with ‘+’. The conservation histogram shows the conservation of residues based on physicochemical properties. The multi-Harmony histogram shows the subtype-specific residues. B Multiple sequence alignment (n = 50) of each type of Vpu protein, colored according to the ClustalX color scheme. Important functional motifs and residues are indicated below the alignment. The virus names corresponding to the sequences are provided in Supplementary fig. S2. The table on the right shows the ability (+) or inability (−) of each type to antagonize tetherin (BST-2) or degrade cellular CD4. Functions that are as yet unknown are indicated as UN. *Strain RBF206 is the only strain in group O known to be active against tetherin (Mack et al. 2017).
Fig. 3
Boxplots comparing the sizes of HIV-1 and SIV vpu genes, lengths of overlap, and whole-genome lengths. Boxplots of the nucleotide (nt) lengths of vpu (A), the length of the overlap between vpu and env (B), and the whole-genome lengths (C) are shown (total sequences, n = 56; see Supplementary table S3). The line inside each box marks the median length, and the whiskers extend to the smallest and largest values within 1.5 times the interquartile range from the 25th and 75th percentiles. Outliers are represented by individual points beyond the whiskers. The numbers below each figure indicate the Vpu-type. Tukey’s method was used to identify significant differences between means (*p < 0.05, **p < 0.01).
Fig. 4
Midpoint-rooted phylogenetic tree of the nucleotide sequences in the pol–env region. The tree was constructed using 426 sequences (see Supplementary table S1) with 1,000 bootstrap replicates. The region of focus is shown in Supplementary fig. S1. Annotations of the viral strains and GenBank IDs, extracted from GenBank, are colored according to type: lacking both vpu and vpx (black), Vpx-type (orange), Vpu-type 1 (red), Vpu-type 2 (brown), Vpu-type 3 (green), and Vpu-type 4 (blue). SIVmac/smm, SIVcpz, HIV-1 group M, HIV-1 group O, and HIV-2 were collapsed and are shown as black triangles. The scale bar below the tree indicates 0.3 (30%) nucleotide substitutions per site, and the bootstrap values (% of 1,000 replicates) are shown at each node.
Fig. 5
Unrooted phylogenetic tree of Vpu proteins showing the geographic distributions of the viruses. The tree was constructed using 252 sequences (see Supplementary tables S4 and S6) with 1,000 bootstrap replicates. The branches are colored according to the official viral subtypes: orange, subtype A; red, subtype B; strawberry pink, variant B; dark blue, subtype C; light blue, subtype D; magenta, subtype F; yellow, subtype G; dark green, subtype H; brown, subtype J; light orange, subtype K; cyan blue, subtype L. The pie chart shows the geographic prevalence of the protein subtypes, using the following colors: Asia (light green), Africa (yellow), Oceania (purple), Europe (light blue), North America (light orange), and South America (pink). The scale bar below the tree indicates 0.2 (20%) amino acid substitutions per site, and the bootstrap values (% of 1,000 replicates) are shown at each node.
Fig. 6
Proposed model of vpu gene evolution in SIV and HIV-1 strains. The types are labeled above each image. The viral strain and name of each monkey species are written in the individually colored boxes. Black arrows indicate the possible ancestors of SIVmon and SIVden. Vpu evolved through processes of optimization to create an ideal gene. Yellow stars mark the points of recombination and transmission events that contributed to the evolution of the vpu gene. In our evolutionary model, we incorporated images of multiple monkey faces, after cropping them from their original sizes. The photographs were obtained from Irasutoya, Freepik by brgfx (https://www.freepik.com/), and Wikimedia Commons. The photographers from Wikimedia Commons are credited as: Michael Gäbler, Alena Houšková, Thomas Springer, Peggy Motsch, Laetitia C, Paul Harrison, Wookiemedia, Aaron Logan, Jack Hynes, Six Plus by Libé, and Madhero88. The images are distributed under the CC BY 3.0, CC0 1.0, CC BY-SA 3.0, CC BY-SA 4.0, and CC BY 2.5 licenses.
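
The whisker rule stated in the Fig. 3 caption (whiskers reach the most extreme values lying within 1.5 times the interquartile range of the 25th and 75th percentiles, with points beyond them drawn as outliers) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `tukey_whiskers` and the sample lengths are hypothetical, and the quartile convention used here (`method="inclusive"`) is an assumption, since plotting packages differ in how they interpolate percentiles.

```python
from statistics import quantiles


def tukey_whiskers(values):
    """Summarize values using the 1.5 x IQR boxplot whisker rule.

    Hypothetical helper for illustration; quartiles are computed with
    the "inclusive" interpolation method, which is an assumption.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up gene lengths in nucleotides, NOT data from the paper.
    example_lengths = [240, 243, 246, 246, 249, 252, 300]
    print(tukey_whiskers(example_lengths))
```

With the made-up input above, the value 300 falls beyond the upper fence and is reported as an outlier, while the whiskers stop at 240 and 252.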
