Nat Commun. 2022 Sep 6;13(1):5252.
doi: 10.1038/s41467-022-32832-w.

Extensive gut virome variation and its associations with host and environmental factors in a population-level cohort

Suguru Nishijima et al. Nat Commun.

Abstract

Indigenous bacteriophage communities (virome) in the human gut have a huge impact on the structure and function of gut bacterial communities (bacteriome), but virome variation at a population scale has not yet been fully investigated. Here, we analyse the gut dsDNA virome in the Japanese 4D cohort of 4,198 deeply phenotyped individuals. By assembling metagenomic reads, we discover thousands of high-quality phage genomes, including previously uncharacterised phage clades whose bacterial hosts differ from those of the known major clades. The distribution of host bacteria is a strong determinant of the distribution of phages in the gut, and virome diversity is highly correlated with anti-viral defence mechanisms of the bacteriome, such as CRISPR-Cas and restriction-modification systems. We identify 97 intrinsic and extrinsic factors that significantly affect the virome structure, including age, sex, lifestyle, and diet, most of which show consistent associations with both phages and their predicted bacterial hosts. Among the metadata categories, disease and medication have the strongest effects on the virome structure. Overall, these results provide a basis for understanding the symbiotic communities of bacteria and their viruses in the human gut, which will facilitate the medical and industrial applications of indigenous viruses.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Overview of reconstructed phage genomes from 4198 human gut metagenomes.
a Genome size and GC content of phage genomes (n = 4709) reconstructed from the dataset of 4,198 whole metagenomes. Bar plots on the top and right side depict the distribution of genome size and GC content, respectively. b Taxonomic annotation of the vOTUs at the family level. c Predicted hosts of the vOTUs at the phylum level. d Number of predicted hosts at the genus level for each vOTU. e Rarefaction curves of the detected vOTUs and VCs (mean number) in this cohort (n = 4,198). Shadows of the lines represent 95% confidence intervals. f Number of vOTUs, genome size, taxonomy, the ratio of specialist and generalist phages, and the ratio of virulent and temperate phages for each predicted host (at the genus level). If a vOTU was predicted to infect more than one genus, it was distributed to all the predicted genera. Thirty-three genera predicted to be infected with more than 5 vOTUs are shown in the figure. In boxplots, boxes represent the interquartile range (IQR), and the lines inside show the median. Whiskers denote the lowest and highest values within 1.5 times the IQR.
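The boxplot convention stated in this caption (box = IQR, line = median, whiskers = the lowest and highest values within 1.5 times the IQR) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `boxplot_stats` is hypothetical, and the quartile convention (the `inclusive` method of Python's `statistics.quantiles`) is an assumption, since the caption does not state which one was used.

```python
import statistics


def boxplot_stats(values):
    """Return the box, whisker ends, and outliers for one boxplot.

    Whiskers end at the most extreme observed values that still lie
    within 1.5 * IQR of the box, as described in the caption.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    # Assumed quartile convention: linear interpolation, 'inclusive' method.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Toy values only; not data from the study.
    print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With the toy input above, the value 100 falls outside the upper fence, so the upper whisker stops at 9 and 100 is reported as an outlier.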
Fig. 2. Identification of novel viral clusters abundant and prevalent in the human gut.
a Average abundance of each VC among the 4198 individuals and the number of genomes forming the VC. The colour and size of each circle represent the phylum-level taxonomy of the predicted host and prevalence of the VC in the cohort, respectively. b Similarity of phage genomic content (proportion of shared proteins) visualised by tSNE. The colour of each circle represents the VC assigned to the genome. c Phylogenetic trees of the 10 most abundant and prevalent VCs in this cohort and phages in the RefSeq database were constructed based on large terminase proteins. Circles on the edges show vOTUs belonging to the VC and edges without circles represent reference genomes in RefSeq. Only representative genomes for each vOTU are included in the trees.
Fig. 3. Close interactions between the gut virome and bacteriome.
Comparisons of α-diversity (Shannon index) (a) and average β-diversity (Bray-Curtis distance) (b) between the virome and bacteriome of the human gut (n = 4,198 individuals). In b each circle represents the average value of the Bray-Curtis distance against other individuals. Regression lines are indicated in red. vOTU level and species level profiles of the virome and bacteriome were used, respectively. c Spearman correlations between relative abundances of vOTUs and predicted hosts at the genus level. Blue and orange circles represent specialist phages (i.e. phages predicted to infect only one genus) and generalist phages (i.e. phages predicted to infect more than one genus), respectively. Twenty-two genera (average relative abundance > 0.1%) with at least 5 vOTUs are shown in the figure. d Distribution of phage-host correlations across the 4,198 individuals. Blue, orange, and green colours represent the distributions of all phages, specialist phages, and generalist phages, respectively. Dashed lines show the average correlation in the distribution. e Comparison of the Shannon index of the virome among groups of individuals with high, medium, and low abundances of defence genes in the metagenomic data. The 4,198 individuals were placed into the three groups based on tertiles of the total abundances for the anti-phage systems. Asterisks represent statistical significance (P < 0.05, Wilcoxon rank-sum test, two-sided). f Heatmap summarising Spearman correlations between relative abundance of the anti-phage systems and the Shannon index of the virome. Asterisks represent statistical significance (P < 0.05). In boxplots, boxes represent the interquartile range (IQR), and the lines inside show the median. Whiskers denote the lowest and highest values within 1.5 times the IQR. Spearman correlations and their statistical significance were calculated using the cor.test function in R (two-sided).
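The two diversity measures named in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the Shannon index uses the natural logarithm (the caption does not give the base), and each profile is a list of non-negative abundances over the same ordered set of vOTUs or species.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("profile must have positive total abundance")
    return -sum((x / total) * math.log(x / total) for x in abundances if x > 0)


def bray_curtis(a, b):
    """Bray-Curtis distance: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same taxa")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def mean_distance_to_others(profiles):
    """For each individual, the average Bray-Curtis distance to all others,
    mirroring the per-individual averaging described for panel b."""
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two profiles")
    return [
        sum(bray_curtis(profiles[i], profiles[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy relative-abundance profiles; not data from the study.
    toy = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [0.5, 0.5, 0.0]]
    print([round(shannon_index(p), 3) for p in toy])
    print([round(d, 3) for d in mean_distance_to_others(toy)])
```

Averaging over all other individuals is quadratic in cohort size; for thousands of profiles a vectorised implementation would be the practical choice, but the arithmetic is the same.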
Fig. 4. Age- and sex-related changes in the human gut virome.
a Correlation between age/sex and the Shannon index of the virome. Individuals younger than 20 years (n = 2) and older than 80 years (n = 6) were excluded from the plot due to the low numbers of individuals. Spearman correlation and its statistical significance were calculated using the cor.test function in R (two-sided). b Average relative abundance of phages grouped according to the phylum-level taxonomy of host bacterial species. Each plot represents the average (mean) relative abundance of all the phages predicted to infect the phylum at each age. Red and blue asterisks show significant positive and negative correlations, respectively (*FDR < 0.01, **FDR < 0.001). Error bars represent standard errors. c, d Associations of the viral profile with age (c) and sex (d). X- and Y-axes show the effect size and log10-transformed P-values obtained from multivariable regression analysis, respectively. The circles and triangles in the plots represent vOTUs and VCs, respectively. No correction for multiple testing was performed here since only variables with FDR < 0.05 in the univariate regression analysis were included in the multivariable regression analysis (Methods). In boxplots, boxes represent the interquartile range (IQR), and the lines inside show the median. Whiskers denote the lowest and highest values within 1.5 times the IQR.
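The FDR thresholds in this caption refer to P-values adjusted for multiple testing. The caption does not name the adjustment procedure, so the sketch below assumes the widely used Benjamini-Hochberg method purely for illustration; the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    The i-th smallest P-value is scaled by m / i, then a running minimum
    from the largest rank downwards keeps the adjusted values monotone.
    """
    m = len(p_values)
    if m == 0:
        return []
    if any(p < 0 or p > 1 for p in p_values):
        raise ValueError("P-values must lie in [0, 1]")
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy P-values only; not results from the study.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"P = {p:.3f}  adjusted = {q:.3f}")
```

A variable would then pass a filter such as FDR < 0.05 when its adjusted value is below that threshold.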
Fig. 5. Host and environmental factors significantly associated with the gut virome.
a Proportion of the variance explained by each metadata category in the gut virome and bacteriome (n = 4,198 individuals) assessed by redundancy analysis. b Distance-based redundancy analysis showing the association between the 4,198 virome profiles at the vOTU level and collected metadata. Grey points show each viral profile, and arrows represent associations with the metadata. The length of the arrow indicates the strength of the association, and the colour represents the metadata category. The top 20 non-redundant metadata with the strongest associations are shown in the plot. c Positive correlations between the explained variance of the gut virome and bacteriome by each metadata. Each point represents a metadata variable and its colour indicates the metadata category. The blue line represents the regression line, and the grey shadow shows the 95% confidence interval. Spearman correlation and its statistical significance were calculated using the cor.test function in R (two-sided). d Proportion of the explained variance in the gut virome and bacteriome by each metadata assessed by permutational analysis of variance (FDR < 0.05). The bottom heatmap shows adjusted P-values for associations between the metadata and the virome diversity assessed by multivariable regression analysis adjusting for other covariates. Red and blue colours represent positive and negative associations, respectively. No correction for multiple testing was performed here since only variables with FDR < 0.05 in the univariate regression analysis were included in the multivariable regression analysis (Methods). For medication, the top 20 drugs with the strongest associations with the virome are shown. Abbreviations: BSS, Bristol stool scale; DM, diabetes mellitus; IBD, inflammatory bowel disease; CRC, colorectal cancer; ACS, acute coronary syndrome; PH, past history; GI, gastrointestinal tract.
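Several captions report Spearman correlations computed with the cor.test function in R. The coefficient itself is the Pearson correlation of the ranks, which can be sketched as below. This is a minimal Python illustration, not a reproduction of cor.test: the function names are hypothetical, ties receive average ranks, and the P-value calculation is omitted.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy explained-variance values for two communities; not data from the study.
    virome = [0.10, 0.25, 0.05, 0.40]
    bacteriome = [0.20, 0.30, 0.10, 0.35]
    print(round(spearman_rho(virome, bacteriome), 3))
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotone transformation of either variable, which suits skewed quantities such as relative abundances.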
