Microbiome. 2025 Aug 20;13(1):188.
doi: 10.1186/s40168-025-02185-9.

The genetic diversity and populational specificity of the human gut virome at single-nucleotide resolution

Xiuchao Wang et al. Microbiome.

Abstract

Background: Large-scale characterization of gut viral genomes provides strain-resolved insights into host-microbe interactions. However, existing viral genomes are mainly derived from Western populations, limiting our understanding of global gut viral diversity and functional variations necessary for personalized medicine and addressing regional health disparities.

Results: Here, we introduce the Chinese Gut Viral Reference (CGVR) set, consisting of 120,568 viral genomes from 3,234 deeply sequenced fecal samples collected nationwide, covering 72,751 viral operational taxonomic units (vOTUs), nearly 90% of which are likely absent from current databases. Analysis of single-nucleotide variations (SNVs) in 233 globally prevalent vOTUs revealed that 18.9% showed significant genetic stratification between Chinese and non-Chinese populations, potentially linked to bacterial infection susceptibility. The predicted bacterial hosts of population-stratified viruses exhibit distinct genetic components associated with health-related functions, including multidrug resistance. Additionally, viral strain diversity at the SNV level correlated with human phenotypic traits, such as age and gastrointestinal issues like constipation. Our analysis also indicates that the human gut bacteriome is specifically shaped by the virome, which mediates associations with human phenotypic traits.

Conclusions: Our analysis underscores the unique genetic makeup of the gut virome across populations and emphasizes the importance of recognizing gut viral genetic heterogeneity for deeper insights into regional health implications.

Keywords: Bioinformatics; Cohort study; Genomic epidemiology; Metagenomics; Virome.

Conflict of interest statement

Declarations: Ethics approval and consent to participate. The study was approved by the Ethics Review Board of Jiangnan University (reference number JNU20220901IRB01) and included 3,234 participants from 30 provinces across Mainland China. Informed consent was obtained from all participants prior to sample collection. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
The Chinese Gut Viral Reference (CGVR) extends global viral diversity. a Overview of the CGVR database construction pipeline. A total of 120,568 qualified viral bins were generated, resulting in 72,751 viral operational taxonomic units (vOTUs), including 42,803 high-quality vOTUs. b Distribution of assembled viral genomes based on genomic length and quality. c Summary of viral genomes categorized by completeness (complete, n = 8,667; >90% complete, n = 55,710; 50–90% complete, n = 56,191). d ICTV taxonomic annotation of viral genomes in the CGVR, with numbers indicating the total genomes within each viral class. e Accumulation curves for vOTUs in the CGVR. Red curves include all vOTUs, while blue curves focus on vOTUs (n = 7,441) with at least two conspecific viral sequences. f Functional characterization of novel vOTUs compared with the UHGV catalog, with bars color-coded by broad COG categories, blue for cellular processes and signaling, red for information storage and processing, orange for metabolism, and grey for poorly characterized
Fig. 2
Population-specific viruses and genomic stratification of predominant viruses linked to bacterial host infection. a Overlap of vOTUs between the CGVR and UHGV based on 95% ANI and 85% AF. b Summary of viral genomes in 233 commonly presented vOTUs across Chinese and non-Chinese populations, defined as having at least 3 genomes in a given vOTU and a minimum of 10% representation in either population. c Characteristics of the 233 commonly presented vOTUs. Red dot plot indicates viral genome length, blue dot plot represents the number of viral CDSs, and green dot plot shows detected SNV sites. d Summary of vOTUs exhibiting significant genomic stratification between Chinese and non-Chinese populations at single-nucleotide resolution. The circular chart shows red for vOTUs with significant population stratification (44 vOTUs, FDR < 0.05) and blue for those without. e Overview of 44 vOTUs with significant genetic stratifications between populations, where green indicates genetic distance between Chinese and non-Chinese populations, red denotes genetic distance within the Chinese population, and blue indicates genetic distance within non-Chinese populations. f Summary of SNVs showing significant enrichment between populations. A total of 4,950 loci from 44 vOTUs were analyzed with Fisher’s exact tests, identifying 239 loci that were differentially enriched in either population at FDR < 0.05. g Phylogenetic tree of FeceHeNa7_vae_40 based on SNVs, with red representing viral genomes from Chinese populations and blue from non-Chinese populations. h Genomic annotation of FeceHeNa7_vae_40, highlighting differential SNVs in genes encoding a phage portal protein. Annotated protein-coding genes (arrows) are color-coded by function, including assembly, lysis, integration, replication, and infection. i Phylogenetic tree of FNMHLBE15_vae_111 based on SNVs, with red indicating genomes from Chinese populations and blue indicating non-Chinese populations. j Summary of viral genomes of FNMHLBE15_vae_111 with differential allele types in the OMEACPOP_00016 gene, marked in red for Chinese and blue for non-Chinese genomes. k Predicted protein structure encoded by the OMEACPOP_00016 gene, with differential physicochemical properties indicated by gray dashed lines
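
The Fig. 2 caption describes per-locus Fisher's exact tests followed by an FDR cutoff of 0.05. As a minimal sketch of how such a test could be set up (not the authors' pipeline), the Python below computes a two-sided Fisher's exact p-value for one SNV locus and then adjusts p-values across loci. The allele counts, the locus names, the function names, and the choice of the Benjamini-Hochberg procedure for FDR control are all assumptions made for illustration; the caption does not specify them.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: population (e.g. Chinese, non-Chinese).
    Columns: allele (e.g. alternate, reference).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are as likely as, or less likely than, the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical allele counts per locus: (alt_CN, ref_CN, alt_nonCN, ref_nonCN).
loci = {
    "locus_A": (18, 2, 3, 17),
    "locus_B": (10, 10, 9, 11),
    "locus_C": (15, 5, 6, 14),
}
pvals = [fisher_exact_two_sided(*counts) for counts in loci.values()]
qvals = benjamini_hochberg(pvals)
for name, p, q in zip(loci, pvals, qvals):
    flag = "enriched" if q < 0.05 else "not significant"
    print(f"{name}: p={p:.3g}, q={q:.3g} ({flag})")
```

For scale, the counts reported in the caption correspond to 44 of 233 vOTUs (about 18.9%) and 239 of 4,950 loci (about 4.8%) passing the FDR threshold.
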
Fig. 3
Predicted bacterial hosts of population-stratified viruses exhibit unique genetic components associated with diverse health-related functions including multidrug resistance. a Predicted virus-bacteria pairs categorized by bacterial hosts at the phylum level. A total of 117 virus-bacteria pairs involve 25 vOTUs and 68 bacterial species. Red indicates viruses, blue indicates bacterial hosts, and gray boxes represent bacterial hosts categorized by phyla, including Bacillota_A and Bacteroidota. b Distinct sub-species clades of FNMHLBE15_vae_111 genomes based on SNV profiles, with blue for clade 1 and red for clade 2. c Genetic dissimilarity of the predicted bacterial host P. distasonis within the two distinct sub-species clades of FNMHLBE15_vae_111, with blue and red indicating within-clade genomic dissimilarities. d Summary of the predicted bacterial host P. distasonis specific to the sub-species clades of FNMHLBE15_vae_111. e Differential allele types within the gene PNJHJHOD_03790_999 (antibiotic ABC transporter ATP-binding and multidrug export protein) in P. distasonis between the sub-species clades of FNMHLBE15_vae_111
Fig. 4
Genetic diversity within viral species is associated with host phenotypic traits. a Summary of associations between vOTU genomic dissimilarity and host phenotypic differences. Each bar represents a phenotype, sorted by the number of associations (FDR < 0.05), with color coding: red for physiological traits, blue for lifestyle factors, and green for dietary habits. b Genomic dissimilarity of FSHCM57_vae_33 associated with drinking frequency. c Genomic dissimilarity of FGZ37M_FDME220052037-1a_vae_29 related to host age. d Distinct sub-species clades of FeceHuNa150110_FDME220052261-1a_vae_31 based on SNV profiles, with red indicating clade 1 and blue indicating clade 2. e Sex distribution of hosts for FeceHuNa150110_FDME220052261-1a_vae_31, highlighting the differential distribution of viral genomes between male and female hosts
Fig. 5
Abundance variations of newly characterized vOTUs widely associated with human phenotypes. a Summary of associations between vOTU abundances and various phenotypic traits. Human phenotypes are represented in gray, while gut viruses from different viral classes are color-coded. Each line indicates a significant association between a viral factor and a phenotype, with line colors corresponding to the associated viral class. b The relative abundance of FSHXHZ2_vae_115 is significantly higher in urban populations compared to rural areas. The relative abundance undergoes CLR transformation. c The relative abundance of FSDLZ21_vae_435 shows a positive correlation with host age, indicating that older individuals tend to harbor higher levels of this viral species. The relative abundance undergoes CLR transformation. d The relative abundance of FSDLZ44_vae_143 is negatively associated with defecation frequency, suggesting that increased levels of this viral species may be linked to lower frequency of bowel movements. The relative abundance undergoes CLR transformation. e The relative abundance of FGSYC78_vae_185 is negatively associated with the frequency of bean consumption, indicating that individuals who consume beans more frequently tend to have lower levels of this viral species. The relative abundance undergoes CLR transformation
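
The Fig. 5 caption reports relative abundances after CLR (centered log-ratio) transformation. A minimal sketch of that transform is shown below: each value is log-transformed and the mean of the logs across the sample is subtracted. The pseudocount used to avoid taking the log of zero and the example abundances are assumptions for illustration; the caption does not say how zeros were handled.

```python
import math


def clr(abundances, pseudocount=1e-6):
    """Centered log-ratio transform of one sample's relative abundances.

    The pseudocount is an assumed way of handling zero abundances.
    """
    logs = [math.log(x + pseudocount) for x in abundances]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [value - mean_log for value in logs]


# Hypothetical relative abundances of four vOTUs in one sample.
sample = [0.50, 0.30, 0.15, 0.05]
transformed = clr(sample)
print([round(v, 3) for v in transformed])
```

CLR values within a sample sum to zero, so each value expresses a vOTU's abundance relative to the sample's geometric mean rather than as a fraction of the total.
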
Fig. 6
The gut bacteriome composition variations are specific to virome. a Relationship between the number of identified viruses and the number of detected bacterial species per sample. Red density represents the number of detected bacteria, while blue density represents the number of detected viruses. Each dot indicates an individual. b Correlation of overall virome and bacteriome composition based on Bray–Curtis distance calculated using abundance data. Each dot represents the Bray–Curtis distance between pairs of individuals. Regression lines are shown in white. c Summary of viral abundance associations with bacterial abundance confirmed by CRISPR-spacer imprints. The bar plot summarizes the number of vOTUs with significant associations per bacterial species, while the histogram shows the distribution of correlation coefficients based on viral class. d Negative association between FQHXN96_vae_219 and E. coli abundances. The relative abundance undergoes CLR transformation. e Negative association between FSCPS65_vae_71 and B. intestinihominis abundances. The relative abundance undergoes CLR transformation
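
Panel b of Fig. 6 compares virome and bacteriome composition through Bray–Curtis distances between pairs of individuals. As a rough sketch of that distance (the sum of absolute abundance differences divided by the sum of all abundances), the Python below computes it for two samples. The function name and the example abundance vectors are hypothetical, and the convention of returning 0.0 for two empty samples is an assumption.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    total = sum(x) + sum(y)
    if total == 0:
        # Assumed convention: two empty samples are treated as identical.
        return 0.0
    return sum(abs(a - b) for a, b in zip(x, y)) / total


# Hypothetical abundances of the same three taxa in two individuals.
person_1 = [10, 5, 0]
person_2 = [4, 5, 6]
print(round(bray_curtis(person_1, person_2), 3))
```

Computing this distance once on viral abundances and once on bacterial abundances for every pair of individuals would give the two sets of pairwise values that a plot like panel b relates to each other.
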
Fig. 7
Mediation linkages among the gut virome, bacteriome, and human phenotypes. a Venn diagram showing the number of viruses associated with phenotypes and bacteria, respectively. b Parallel coordinates chart displaying 107 significant mediation effects of gut microbial species. The left panel shows the viral species (vOTUs), the middle panel presents the microbial species, and the right panel illustrates the human phenotypes. Curved lines across the panels indicate mediation effects, with colors representing different phenotypes. c Microbial species B. uniformis mediates the effect of viral species FHLJYC39_FDME220052688-1a_vae_1390 on age. d Microbial species A. putredinis mediates the effect of viral species FGDLZ19_FDME220051917-1a_vae_242482 on defecation frequency

References

    1. Campbell DE, Ly LK, Ridlon JM, Hsiao A, Whitaker RJ, Degnan PH. Infection with Bacteroides phage BV01 alters the host transcriptome and bile acid metabolism in a common human gut microbe. Cell Rep. 2020;32. - PMC - PubMed
    1. Norman JM, Handley SA, Baldridge MT, Droit L, Liu CY, Keller BC, Kambal A, Monaco CL, Zhao G, Fleshner P, et al. Disease-specific alterations in the enteric virome in inflammatory bowel disease. Cell. 2015;160:447–60. - PMC - PubMed
    1. de Jonge PA, Wortelboer K, Scheithauer TPM, van den Born BH, Zwinderman AH, Nobrega FL, Dutilh BE, Nieuwdorp M, Herrema H. Gut virome profiling identifies a widespread bacteriophage family associated with metabolic syndrome. Nat Commun. 2022;13:3594. - PMC - PubMed
    1. Lang S, Demir M, Martin A, Jiang L, Zhang X, Duan Y, Gao B, Wisplinghoff H, Kasper P, Roderburg C, et al. Intestinal virome signature associated with severity of nonalcoholic fatty liver disease. Gastroenterology. 2020;159:1839–52. - PMC - PubMed
    1. Yang K, Niu J, Zuo T, Sun Y, Xu Z, Tang W, Liu Q, Zhang J, Ng EKW, Wong SKH, et al. Alterations in the gut virome in obesity and type 2 diabetes mellitus. Gastroenterology. 2021;161(4):1257-1269.e1213. - PubMed
