Nat Commun. 2024 Sep 26;15(1):8261.

doi: 10.1038/s41467-024-52598-7.

Decoding the diagnostic and therapeutic potential of microbiota using pan-body pan-disease microbiomics

Georges P Schmartz^#¹, Jacqueline Rehner^#², Madline P Gund^#³, Verena Keller^#⁴, Leidy-Alejandra G Molano¹, Stefan Rupf^{3,5}, Matthias Hannig³, Tim Berger⁶, Elias Flockerzi⁶, Berthold Seitz⁶, Sara Fleser⁷, Sabina Schmitt-Grohé⁷, Sandra Kalefack⁷, Michael Zemlin⁷, Michael Kunz⁸, Felix Götzinger⁸, Caroline Gevaerd⁹, Thomas Vogt⁹, Jörg Reichrath⁹, Lisa Diehl¹, Anne Hecksteden^{10,11}, Tim Meyer¹⁰, Christian Herr¹², Alexey Gurevich^{13,14}, Daniel Krug¹³, Julian Hegemann^{13,15}, Kenan Bozhueyuek¹³, Tobias A M Gulder^{13,15}, Chengzhang Fu¹³, Christine Beemelmanns¹³, Jörn M Schattenberg⁴, Olga V Kalinina¹³, Anouck Becker¹⁶, Marcus Unger¹⁶, Nicole Ludwig¹, Martina Seibert⁶, Marie-Louise Stein⁶, Nikolas Loka Hanna¹², Marie-Christin Martin⁶, Felix Mahfoud⁸, Marcin Krawczyk⁴, Sören L Becker^#², Rolf Müller^#^{13,17}, Robert Bals^#^{12,17}, Andreas Keller^#^{18,19,20}

Affiliations

¹ Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany.
² Institute of Medical Microbiology and Hygiene, Saarland University, 66421, Homburg, Germany.
³ Clinic of Operative Dentistry, Periodontology and Preventive Dentistry, Saarland University, 66421, Homburg, Germany.
⁴ Department of Medicine II, Saarland University Medical Center, 66421, Homburg, Germany.
⁵ Synoptic Dentistry, Saarland University, 66421, Homburg, Germany.
⁶ Department of Ophthalmology, Saarland University Medical Center, 66421, Homburg, Germany.
⁷ Department of General Pediatrics and Neonatology, Saarland University, 66421, Homburg, Germany.
⁸ Department of Internal Medicine III, Cardiology, Angiology, Intensive Care Medicine, Saarland University Hospital, 66421, Homburg, Germany.
⁹ Clinic for Dermatology, Venereology, and Allergology, 66421, Homburg, Germany.
¹⁰ Institute for Sport and Preventive Medicine, Saarland University, 66123, Saarbrücken, Germany.
¹¹ Chair of Sports Medicine, Institute of Physiology, Medical University of Innsbruck, Innsbruck, Austria.
¹² Department of Internal Medicine V - Pulmonology, Allergology, Intensive Care Medicine, Saarland University, Saarbrücken, Germany.
¹³ Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany.
¹⁴ Center for Bioinformatics Saar and Saarland University, Saarland Informatics Campus, 66123, Saarbrücken, Germany.
¹⁵ Department of Pharmacy, Saarland University, 66123, Saarbrücken, Germany.
¹⁶ Department for Neurology, Saarland University Medical Center, 66421, Homburg, Germany.
¹⁷ PharmaScienceHub, 66123, Saarbrücken, Germany.
¹⁸ Clinical Bioinformatics, Saarland University, 66123, Saarbrücken, Germany. andreas.keller@ccb.uni-saarland.de.
¹⁹ Helmholtz Institute for Pharmaceutical Research Saarland, 66123, Saarbrücken, Germany. andreas.keller@ccb.uni-saarland.de.
²⁰ PharmaScienceHub, 66123, Saarbrücken, Germany. andreas.keller@ccb.uni-saarland.de.

^# Contributed equally.

PMID: 39327438
PMCID: PMC11427559
DOI: 10.1038/s41467-024-52598-7

Georges P Schmartz et al. Nat Commun. 2024.

Abstract

The human microbiome emerges as a promising reservoir for diagnostic markers and therapeutics. Since host-associated microbiomes at various body sites differ and diseases do not occur in isolation, a comprehensive analysis strategy highlighting the full potential of microbiomes should include diverse specimen types and various diseases. To ensure robust data quality and comparability across specimen types and diseases, we employ standardized protocols to generate sequencing data from 1931 prospectively collected specimens, including from saliva, plaque, skin, throat, eye, and stool, with an average sequencing depth of 5.3 gigabases. Collected from 515 patients, these samples yield an average of 3.7 metagenomes per patient. Our results suggest significant microbial variations across diseases and specimen types, including unexpected anatomical sites. We identify 583 unexplored species-level genome bins (SGBs) of which 189 are significantly disease-associated. Of note, the existence of microbial resistance genes in one specimen was indicative of the same resistance genes in other specimens of the same patient. Annotated and previously undescribed SGBs collectively harbor 28,315 potential biosynthetic gene clusters (BGCs), with 1050 significant correlations to diseases. Our combinatorial approach identifies distinct SGBs and BGCs, emphasizing the value of pan-body pan-disease microbiomics as a source for diagnostic and therapeutic strategies.

PubMed Disclaimer

Conflict of interest statement

G.P.S., R.M., and A.K. are co-founders of MooH GmbH, a company developing metagenomic-based oral health tests. F.M. is supported by Deutsche Gesellschaft für Kardiologie (DGK), Deutsche Forschungsgemeinschaft (SFB TRR219, Project-ID 322900939), and Deutsche Herzstiftung. His institution (Saarland University) has received scientific support from Ablative Solutions, Medtronic, and ReCor Medical. He has received speaker honoraria/consulting fees from Ablative Solutions, Amgen, Astra-Zeneca, Bayer, Boehringer Ingelheim, Inari, Medtronic, Merck, ReCor Medical, Servier, and Terumo. The remaining authors declare no competing interests.

Figures

**Fig. 1. Study set up, metagenomics data and clinical information.**
a Schematic workflow describing the sample (upper arrow) and data flow (lower arrow) between clinicians, microbiology, and data science. The clinical data were kept separate from the measurement of microbiomes and only combined after measurement in the computational analysis. b Clinical sampling was focused on seven biospecimens (left blue part). We included patients from a wide range of clinical diseases, which allows us to analyze the diagnostic potential of different specimen types across diseases. Created with BioRender.com released under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license. c Sankey plot for the number of samples included in the study at different intervals of the data generation process in relation to our quality control strategy. Specimen types are ordered vertically at each step in the pipeline by frequency of the respective specimen. d Number of reads for each sample colored by specimen. The horizontal line represents the 5 gigabase threshold at a paired-end read length of 150 bp. e Pruned upset plot displaying the most frequent co-occurrence of diseases within the dataset. The combinations are ordered by decreasing frequency, marking the combination of hypertension and obesity as the most common comorbidity in our study. f Ontology used throughout the study grouping diseases by biological systems and separating healthy controls from diseased patients. Areas are proportional to the number of patients falling into each category. Patients may be represented multiple times if multiple diseases are diagnosed.

**Fig. 2. Compositional analysis, and link of microbiota to diseases.**
a Two-dimensional Uniform Manifold Approximation and Projection (UMAP) embedding of pairwise computed mash distances, colored by biospecimen of the sample. b Alpha-diversity of all samples, colored by specimen. As a measure of species richness, we selected the Shannon diversity. c Relative genus abundance for each cohort of the second ontology level, divided by biospecimen. Only labels for the 20 most abundant genera are displayed. d Sorted log-fold changes of differentially abundant species matching the visualized results of the next panel. Each panel is split vertically separating positive and negative log-fold changes. e–g Number of differentially abundant species after p-value adjustment of ANCOMBC results revealed during analysis across all cohorts and specimen combinations (q-val <0.05). Numbers in the circles represent the number of specimens included in the respective analysis. h Center-log ratio (CLR) normalized abundance counts of selected species-cohort-specimen combinations. The visualized diseased cohort is indicated by the text above each panel, whereas the selected biospecimen is indicated by the color of the writing. The first row of panels displays potential pathogen candidates with the highest statistical significance and a pathogen score of one. The second row of panels displays saliva samples of commensal bacteria candidates with a commensal score larger than eighteen (min(n) = 50). Boxplot follows Tukey’s style indicating the median as well as the second and third quantiles within boxes. Whiskers extend up to 1.5 times the interquartile range in the presence of outliers.
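The two per-sample quantities named in this caption, Shannon diversity (panel b) and CLR-normalized abundance (panel h), can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy count vector, and the pseudocount of 0.5 used to handle zero counts are all assumptions made for illustration only.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log(count + pseudocount) minus the mean of those logs.

    The pseudocount is an illustrative assumption so that zero counts
    do not produce log(0); other zero-handling strategies exist.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical species counts for a single sample.
toy_counts = [120, 30, 0, 50]
print(round(shannon_diversity(toy_counts), 3))
print([round(v, 3) for v in clr(toy_counts)])
```

By construction, CLR values for one sample sum to zero, so they express each species relative to the geometric mean of that sample rather than as an absolute abundance.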

**Fig. 3. Assembly and resistance gene analysis.**
a Distribution of the number of scaffolds in each sample at various length limits, colored by specimen as box-whisker plot (n = 1931). The boxplot follows a similar style to Fig. 2h. b Sequence of pie charts indicating the presence of emerging antimicrobial resistance genes. Panels are subdivided by genus that was assigned to the contig where resistance genes have been detected. Pie charts scale with the number of measurements in different samples and are colored by the relative frequency of the sample’s biospecimen. c Network visualization of counts of shared antimicrobial resistance (AMR) genes among different biospecimen samples derived from the same patient. Note, any resistance gene annotated by AMRFinderPlus was used for this plot. d Dereplicated SGBs defined from our data. Visualized information includes biospecimen of the initial sample where the SGB was derived from, selected resistance information taken from Pathofact, and effect size of differential coverage analysis for selected cohorts. Note, the visualized differential coverage focuses only on the biospecimen of the initial sample where the SGB has been defined from that is also visualized in the central ring.

**Fig. 4. Evidence-supported genome mining and disease association.**
a Schematic representation of our proposed BGC prioritization strategy representing an adapted version of the BiGMAP workflow. Metagenomic assembly is performed for each sample, followed by BGC prediction. Next, all samples are aligned against all core biosynthetic genes of predicted BGCs. Coverage information is extracted, and downstream analysis is performed. b Volcano plot of the differential BGC coverage analysis results. In this visualization, only matching biospecimen – initial BGC contig combinations are visualized, constituting only a fraction of all results. The unadjusted two-tailed unpaired Wilcoxon test p-values are shown with two horizontal lines representing the 0.05 threshold, both before and after p-value adjustment. c Predicted host species distribution of the assembled DNA fragments where significantly associated core biosynthetic genes reside. Color reflects the number of significant BGCs. d Comparison of the highest correlating effect sizes, comparing differential BGC coverage results between alternative diets and diseases. The effect size of the vegetarian-omnivore comparison is visualized on the y-axis. On the x-axis, the cohort named above the panel is compared against the healthy cohort. For the fourth panel, the minimum effect size across all cohort comparisons is taken for each BGC and compared against the diet comparison.




Associated data

SRA/PRJNA1057503
