BMC Microbiol. 2022 Jan 3;22(1):4.
doi: 10.1186/s12866-021-02414-9.

Determine independent gut microbiota-diseases association by eliminating the effects of human lifestyle factors

Congmin Zhu et al. BMC Microbiol. 2022.

Abstract

The effects of lifestyle and physiological variables on human disease risk have been shown to be mediated by the gut microbiota. Case-control studies that look for disease-associated microbes show low concordance with one another, owing to limited sample sizes and population-wide bias in lifestyle and physiological variables. To infer gut microbiota-disease associations accurately, we propose building machine learning models that include both human variables and gut microbiota. When a model built on both gut microbiota and human variables performs better than a model built on human variables alone, an independent gut microbiota-disease association is confirmed. By building models on the American Gut Project dataset, we found that the gut microbiota showed distinct association strengths with different diseases. Adding gut microbiota to human variables significantly enhanced classification performance for inflammatory bowel disease (IBD); independent associations were found between occurrence information of gut microbiota and irritable bowel syndrome, C. difficile infection, and unhealthy status; and adding gut microbiota did not improve model performance for diabetes, small intestinal bacterial overgrowth, lactose intolerance, or cardiovascular disease. Our results suggest that although the gut microbiota has been reported to be associated with many diseases, a considerable proportion of these associations may be very weak. We propose a list of microbes as biomarkers for classifying IBD and unhealthy status. Further functional investigation of these microbes will improve understanding of the molecular mechanisms of human diseases.
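The comparison criterion described in the abstract can be illustrated with a minimal sketch. It assumes that model performance is measured by AUC (the metric named in the figure captions) and that held-out prediction scores are already available for each cross-validation fold. The function names `auc` and `auc_gain` and the toy scores are hypothetical, and the abstract does not state how large a gain, or which statistical test, the authors required, so the sketch only reports the mean gain.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_gain(folds):
    """Mean per-fold AUC difference: (human + microbiota) minus (human only).

    Each fold is a tuple (labels, scores_human_only, scores_human_plus_microbiota).
    """
    return mean(auc(y, s_both) - auc(y, s_human) for y, s_human, s_both in folds)


# Hypothetical held-out predictions for two cross-validation folds
# (illustrative numbers only, not data from the paper).
folds = [
    ([1, 1, 0, 0], [0.6, 0.4, 0.5, 0.3], [0.9, 0.7, 0.4, 0.2]),
    ([1, 0, 1, 0], [0.7, 0.6, 0.5, 0.2], [0.8, 0.3, 0.6, 0.1]),
]
gain = auc_gain(folds)
print(f"mean AUC gain from adding microbiota features: {gain:.2f}")
```

A positive gain is only suggestive on its own. In practice the difference would be judged against fold-to-fold variability before calling the association independent.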

Keywords: Disease classification; Gut microbiota; Human variables; Machine learning.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Workflow of disease classification model construction. We classified eight diseases (IBD: Inflammatory Bowel Disease; CDI: C. difficile Infection; IBS: Irritable Bowel Syndrome; SIBO: Small Intestinal Bacterial Overgrowth; DI: Diabetes; LI: Lactose Intolerance; CD: Cardiovascular Disease; MD: Mental Disorder) using 30 human variables (physiological characteristics, lifestyle, location, and diet) and gut microbial community data (OTUs) obtained from the American Gut Project database, with four machine learning techniques (Random Forest, Gradient Boosting Decision Tree, Logistic Regression, and eXtreme Gradient Boosting). We propose building association models that include both human variables and gut microbiota, and assume that when the model with both gut microbiota and human variables performs better than the model with human variables alone, the independent association of gut microbiota with the disease is confirmed.
Fig. 2
Comparison of AUC values for nine diseases across five feature types
Fig. 3
Feature distribution for the best model (the one with the highest AUC). Different features are marked with different colors and shapes. OTUs are annotated at the taxonomic order level. In all subgraphs, the ordering of host variables and OTUs is fixed and consistent, and OTUs are sorted in descending order of average size
Fig. 4
Performance of four machine learning methods across different features and diseases. The color of each open circle indicates the machine learning method, and its size indicates the standard deviation
Fig. 5
Change in the AUC of the optimal model with the number of OTUs, shown for four conditions (IBD: Inflammatory Bowel Disease; IBS: Irritable Bowel Syndrome; DI: Diabetes; UH: Unhealthy status)
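The captions above refer to OTU-based features, and the abstract mentions "occurrence information" of gut microbiota. As a hedged illustration of how an OTU count table might be turned into model inputs, the sketch below derives two common representations for a single sample: relative abundance and presence/absence. Reading "occurrence" as presence/absence is an assumption, as are the function names, the `min_reads` threshold, and the made-up OTU identifiers. The page does not specify which five feature types the authors used.

```python
def relative_abundance(counts):
    """Scale one sample's OTU read counts so that they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}


def occurrence(counts, min_reads=1):
    """Presence/absence: 1 if an OTU has at least min_reads reads, else 0."""
    return {otu: int(n >= min_reads) for otu, n in counts.items()}


# Hypothetical read counts for one stool sample (OTU ids are made up).
sample = {"OTU_1": 30, "OTU_2": 0, "OTU_3": 10}
abund = relative_abundance(sample)
occ = occurrence(sample)
print(abund)
print(occ)
```

Either dictionary can be joined with the host variables to form one feature row per sample. The occurrence form discards abundance magnitude, which makes it less sensitive to sequencing depth.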
