Gut microbiome-based machine learning model for early colorectal cancer and adenoma screening
- PMID: 41063289
- DOI: 10.1186/s13099-025-00750-z
Abstract
Colorectal cancer (CRC) is a major cause of cancer-related deaths, but early detection at the adenoma stage markedly improves outcomes. Existing tools such as colonoscopy and fecal immunochemical testing (FIT) are invasive or insensitive to early lesions. To develop a non-invasive screening strategy, we analyzed five publicly available 16S rRNA sequencing datasets from North America and East Asia. Using Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) and chi-square testing, we identified 109 discriminatory microbial taxa and trained random forest (RF) classification models to distinguish healthy controls, adenomas, and CRC. The models performed well in internal validation (AUC = 0.90, 95% CI: 0.869-0.931) and external validation (AUC = 0.82), indicating cross-population generalizability. We further developed a microbial risk score (MRS), inspired by polygenic risk score (PRS) methodology, which was significantly elevated in CRC across cohorts. Enrichment of CRC-associated pathogens such as Fusobacterium nucleatum and Porphyromonas gingivalis supports the biological relevance of the findings. These results demonstrate the potential of gut microbiome signatures combined with machine learning as a scalable, non-invasive approach for early CRC and adenoma detection.
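The abstract does not give the MRS formula. As a minimal sketch of what a PRS-style score could look like, the Python below sums per-taxon weights multiplied by log-transformed relative abundances. The function name `microbial_risk_score`, the pseudocount, the log10 transform, and all weights and abundances are assumptions made for illustration; they are not values or methods taken from the study.

```python
import math


def microbial_risk_score(abundances, weights, pseudocount=1e-6):
    """PRS-style weighted sum over taxa (illustrative, not the authors' formula).

    abundances: dict mapping taxon name -> relative abundance in one sample
    weights:    dict mapping taxon name -> signed effect size (assumed, e.g. a
                log fold change from a differential abundance analysis)
    """
    score = 0.0
    for taxon, weight in weights.items():
        # A taxon absent from the sample is treated as zero abundance.
        rel = abundances.get(taxon, 0.0)
        # The pseudocount keeps the logarithm defined at zero abundance.
        score += weight * math.log10(rel + pseudocount)
    return score


# Hypothetical weights and samples, chosen only to show the mechanics.
weights = {"Fusobacterium nucleatum": 1.0, "Porphyromonas gingivalis": 0.5}
control = {"Fusobacterium nucleatum": 0.0001, "Porphyromonas gingivalis": 0.0001}
case = {"Fusobacterium nucleatum": 0.01, "Porphyromonas gingivalis": 0.001}

control_score = microbial_risk_score(control, weights)
case_score = microbial_risk_score(case, weights)
```

With positive weights on CRC-enriched taxa, a sample carrying more of those taxa receives a higher score, which mirrors how a PRS accumulates risk-allele effects.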
Keywords: Adenoma; Colorectal cancer; Gut microbiome; Machine learning; Microbial risk score; Non-invasive screening; Random forest.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: This study re-analyzed publicly available, de-identified 16S rRNA sequencing datasets (accession numbers SRP062005, PRJNA534511, SRP133809, etc.) with associated clinical diagnoses. All original studies had institutional ethics approval and obtained informed consent from participants; no new human or animal experiments were performed. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.