Nat Med. 2023 Nov;29(11):2785-2792.
doi: 10.1038/s41591-023-02599-8. Epub 2023 Nov 2.

Bacterial SNPs in the human gut microbiome associate with host BMI

Liron Zahavi et al. Nat Med. 2023 Nov.

Abstract

Genome-wide association studies (GWASs) have provided numerous associations between human single-nucleotide polymorphisms (SNPs) and health traits. Likewise, metagenome-wide association studies (MWASs) between bacterial SNPs and human traits can suggest mechanistic links, but very few such studies have been done thus far. In this study, we devised an MWAS framework to detect SNPs and associate them with host phenotypes systematically. We recruited and obtained gut metagenomic samples from a cohort of 7,190 healthy individuals and discovered 1,358 statistically significant associations between a bacterial SNP and host body mass index (BMI), from which we distilled 40 independent associations. Most of these associations were unexplained by diet, medications or physical exercise, and 17 replicated in a geographically independent cohort. We uncovered BMI-associated SNPs in 27 bacterial species, and 12 of them showed no association by standard relative abundance analysis. We revealed a BMI association of an SNP in a potentially inflammatory pathway of Bilophila wadsworthia as well as of a group of SNPs in a region coding for energy metabolism functions in a Faecalibacterium prausnitzii genome. Our results demonstrate the importance of considering nucleotide-level diversity in microbiome studies and pave the way toward improved understanding of interpersonal microbiome differences and their potential health implications.

Figures

Extended Data Fig. 1 | SNPs overview.
(a) Distribution of the 12,686,191 detected SNPs across 348 species. (b) Number of samples covering different SNPs.
Extended Data Fig. 2 | Volcano plot.
Volcano plot shows for each SNP the difference between the average BMI in individuals with mostly the alternative allele (major allele frequency ≤ 0.5) and the average BMI in individuals with mostly the major allele (major allele frequency > 0.5; x-axis); and its p-value (y-axis). Red annotations show gene symbols of the protein-coding SNPs left after the clumping stage (if a gene symbol exists). X-axis was truncated to the range of statistically significant associations ±10%.
Extended Data Fig. 3 | BMI differences.
For each of the 40 BMI-associated SNPs that remained after the clumping stage, boxplots (center, median; box, interquartile range; whiskers, 5th and 95th percentiles; notches, 95% confidence interval around the median based on 1,000 bootstrap resamples) compare the host BMI distribution of individuals with no bacteria of this species (left box; Methods), hosts of bacteria with the major allele (middle box; major allele frequency ≥ 0.99) and hosts of bacteria with the minor allele (right box; major allele frequency ≤ 0.01). The grey scale indicates the difference between medians. Groups were compared in a two-sided Mann-Whitney test, and p-values were Bonferroni corrected for 120 hypotheses (40 SNPs, 3 comparisons per SNP).
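The notch construction described in this caption can be sketched as a bootstrap of the median. This is only an illustration: the caption does not say which kind of bootstrap interval was used, so a percentile interval is assumed here, and the function name `bootstrap_median_ci` and the toy BMI values are hypothetical.

```python
import random
import statistics


def bootstrap_median_ci(values, n_boot=1000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval around the median.

    Assumption: a plain percentile interval; the source only states
    "95% confidence interval ... based on 1,000" bootstrap repetitions.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement n_boot times and record each median.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Indices of the lower and upper percentile cut points.
    lo_idx = round((1 - level) / 2 * n_boot)
    hi_idx = round((1 + level) / 2 * n_boot) - 1
    return medians[lo_idx], medians[hi_idx]


# Toy BMI values for one allele group (hypothetical data).
toy_bmi = [21.4, 23.0, 24.8, 25.1, 26.3, 27.9, 29.5, 31.2]
low, high = bootstrap_median_ci(toy_bmi)
print(f"95% CI around the median: [{low:.1f}, {high:.1f}]")
```

The seed is fixed only to make the sketch reproducible; it is not part of the described analysis.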
Extended Data Fig. 4 | Quantile-quantile (Q-Q) plots.
Expected p-values (uniform distribution between 1/[the total number of tested SNPs] and 1) compared to the SNPs' p-values estimated in the MWAS analysis. (a) All tested SNPs. Red dots are the 40 BMI-associated SNPs remaining after the clumping procedure. (b) Each species estimated and plotted separately using a random color. Straight lines connect adjacent SNP dots to increase readability. (c) Species with more than 13 BMI-associated SNPs. Straight lines connect adjacent SNP dots to increase readability.
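The expected-versus-observed comparison in this caption can be sketched as follows. The evenly spaced grid i/n (i = 1..n), which runs from 1/n to 1, is an assumption about how the uniform expectation was discretized, and the function name `qq_points` and the toy p-values are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale.

    Assumption: expected p-values are the evenly spaced grid i/n,
    which spans 1/n to 1 as described in the caption.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("p_values must not be empty")
    observed = sorted(p_values)                   # smallest p-value first
    expected = [(i + 1) / n for i in range(n)]    # 1/n, 2/n, ..., 1
    return [
        (-math.log10(e), -math.log10(o))
        for e, o in zip(expected, observed)
    ]


# Toy p-values (hypothetical); real input would be one p-value per tested SNP.
for exp_val, obs_val in qq_points([0.5, 0.01, 1.0, 0.1]):
    print(f"expected={exp_val:.3f} observed={obs_val:.3f}")
```

Points far above the diagonal at the left end of the sorted list indicate p-values smaller than the uniform expectation.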
Extended Data Fig. 5 | Number of correlated SNPs in each linkage group.
Histograms show the number of correlated SNPs that were found in the clumping stage in each linkage group. The total number of groups is 40, which is the final number of SNPs that remained after the clumping procedure. (a) Full range of group sizes. (b) Groups with 1 to 100 SNPs.
Extended Data Fig. 6 | Power analysis.
Boxplots (center, median; box, interquartile range; whiskers, 1.5 * interquartile range or the most extreme data point) show the calculated power for associating the 40 SNPs with BMI, given the effect size observed in our cohort and various effective sample sizes (N). Alpha was set to 3.9 × 10⁻⁹ based on a cutoff of 0.05 and a Bonferroni correction for 12,686,191 hypotheses.
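The alpha quoted in this caption follows directly from dividing the family-wise cutoff by the number of hypotheses. The short sketch below reproduces that arithmetic; the helper names are hypothetical, and treating significance as a strict inequality is an assumption.

```python
N_SNPS = 12_686_191   # number of hypotheses stated in the caption
FAMILY_ALPHA = 0.05   # family-wise cutoff stated in the caption


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_alpha / n_tests


def passes_cutoff(p_value, family_alpha=FAMILY_ALPHA, n_tests=N_SNPS):
    """True if a raw p-value is below the corrected threshold.

    Assumption: strict inequality; the source does not say whether
    equality counts as significant.
    """
    return p_value < bonferroni_alpha(family_alpha, n_tests)


alpha = bonferroni_alpha(FAMILY_ALPHA, N_SNPS)
print(f"per-test alpha = {alpha:.2e}")
```

The result is about 3.94 × 10⁻⁹, which agrees with the rounded 3.9 × 10⁻⁹ given here and the threshold marked in Fig. 2a.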
Extended Data Fig. 7 | Random replication control.
For each of 1,000 random choices of 40 SNPs from the discovery analysis, the figure shows how many passed the Bonferroni-adjusted 0.05 cutoff for association with BMI in the replication cohort. For reference, the red dotted line marks the number of SNPs that passed the cutoff (17) when the 40 SNPs associated with BMI in the discovery cohort were tested.
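A control of this kind can be sketched as repeated random draws. Several details here are assumptions rather than the authors' procedure: the cutoff is taken as 0.05 divided by the 40 SNPs in each set, draws are made without replacement from a pool of replication-cohort p-values, and the function name and toy p-values are hypothetical.

```python
import random


def random_replication_counts(replication_p, set_size=40, n_draws=1000,
                              family_alpha=0.05, seed=0):
    """For each random set of SNPs, count how many pass the cutoff.

    replication_p: replication-cohort p-values, one per candidate SNP.
    Assumption: the Bonferroni adjustment is over set_size hypotheses.
    """
    if len(replication_p) < set_size:
        raise ValueError("need at least set_size candidate SNPs")
    rng = random.Random(seed)
    cutoff = family_alpha / set_size
    counts = []
    for _ in range(n_draws):
        chosen = rng.sample(replication_p, set_size)  # without replacement
        counts.append(sum(p < cutoff for p in chosen))
    return counts


# Toy pool of p-values (hypothetical): uniform draws mimic SNPs with no signal.
pool_rng = random.Random(1)
toy_pool = [pool_rng.random() for _ in range(5000)]
counts = random_replication_counts(toy_pool)
observed = 17  # number of replicated SNPs reported in the caption
print("random sets reaching the observed count:",
      sum(c >= observed for c in counts))
```

Comparing the observed count with the distribution of counts from random sets indicates how unlikely that level of replication would be for arbitrarily chosen SNPs.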
Extended Data Fig. 8 | Replication cohort characteristics.
Age, sex, and BMI distribution of the 8,204 study participants.
Fig. 1 | Study overview.
a, Illustration of the study design. b, Age, sex and BMI distribution of the study participants. The purple and orange lines in the right panel show the trend of the age–BMI relation for females and males, respectively. The P value of the slope is 10⁻¹⁷ for the purple and 10⁻¹⁰ for the orange lines.
Fig. 2 | Bacterial SNPs associate with host BMI.
a, Manhattan plot showing the P value of each SNP’s association with BMI. SNPs are sorted along the x axis based on taxonomy. The red dashed line marks the Bonferroni-adjusted 0.05 P value threshold = 3.94 × 10⁻⁹. SNPs that were excluded in the clumping stage are colored in light gray. Red annotations show gene symbols of the protein-coding SNPs left after the clumping stage (if a gene symbol exists). b, For the SNPs with the smallest P value (left two) or largest difference between allele groups (right two) out of the SNPs that were not filtered in the clumping stage, box plots (center, median; box, interquartile range; whiskers, 5th and 95th percentiles; notches, 95% confidence interval around the median based on 1,000 bootstrap resamples) compare the host BMI distribution of individuals with no bacteria of this species (Methods), hosts of bacteria with the major allele (major allele frequency ≥ 0.99) and hosts of bacteria with the minor allele (major allele frequency ≤ 0.01). The gray line indicates the difference between medians. Groups were compared in a two-sided Mann–Whitney test, and P values were Bonferroni corrected for 120 hypotheses (40 SNPs, three comparisons per SNP). All 40 SNPs are shown in Extended Data Fig. 3. The q values in the titles are the Bonferroni-adjusted P values of SNPs in the original MWAS regression. NS, not significant.
Fig. 3 | Number of BMI-associated SNPs per species.
Bar height and black numbers show the number of SNPs achieving the Bonferroni-adjusted 0.05 significance cutoff for association with BMI. White numbers show the number of associations retained after the clumping procedure. Species with no BMI-associated SNPs are not shown.
Fig. 4 | Comparison of MWAS results and relative abundance analysis.
a, Pie plot shows the fraction of bacterial species not correlated with BMI by species relative abundance out of the 27 species in which we found BMI-associated SNPs. b, Pie plot shows which of the 40 BMI-associated SNPs are in species associated with BMI by relative abundance.
Fig. 5 | Results replicate in a geographically independent cohort.
Comparison of each SNP’s estimated coefficient (center) and 95% confidence interval (bars) in the MWAS regression in the discovery (x axis) and the replication (y axis) cohorts. SNPs are colored according to whether their Bonferroni-adjusted P value in the replication cohort is below 0.05. SNPs in the upper-right and lower-left quarters have the same correlation directionality in both cohorts.
Fig. 6 | BMI-associated SNPs in Rep_3066.
Top plot, a fraction of the Manhattan plot from Fig. 2a, zoomed-in to show a region of Rep_3066, contig 257, where there are SNPs significantly associated with BMI. SNPs are plotted according to their genomic position (x axis) and P value (y axis). The red dashed line marks the Bonferroni-adjusted 0.05 P value threshold. In the clumping procedure, the SNP with the smallest P value was retained, and other significantly associated SNPs were filtered out. Bottom plot, position of predicted ORFs in the shown genomic region, colored based on their predicted function.
