This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2023 Oct 12:2023.10.12.23296829.

doi: 10.1101/2023.10.12.23296829.

Microbiome-based risk prediction in incident heart failure: a community challenge

Pande Putu Erawijantari¹, Ece Kartal², José Liñares-Blanco²,³,⁴, Teemu D Laajala⁵,⁶, Lily Elizabeth Feldman⁶; FINRISK Microbiome DREAM Challenge and ML4Microbiome Communities; Pedro Carmona-Saez³,⁴, Rajesh Shigdel⁷, Marcus Joakim Claesson⁸,⁹, Randi Jacobsen Bertelsen⁷, David Gomez-Cabrero¹⁰,¹¹, Samuel Minot¹², Jacob Albrecht¹³, Verena Chung¹³, Michael Inouye¹⁴,¹⁵,¹⁶, Pekka Jousilahti¹⁷, Jobst-Hendrik Schultz¹⁸, Hans-Christoph Friederich¹⁸, Rob Knight¹⁹,²⁰,²¹,²², Veikko Salomaa¹⁷, Teemu Niiranen¹⁷,²³,²⁴, Aki S Havulinna¹⁷,²⁵, Julio Saez-Rodriguez², Rebecca T Levinson²,¹⁸, Leo Lahti¹

Affiliations

¹ Department of Computing, Faculty of Technology, University of Turku, Turku, Finland.
² Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany.
³ GENYO. Centre for Genomics and Oncological Research: Pfizer, University of Granada, Andalusian Regional Government, PTS Granada, Avenida de la Ilustración 114, 18016, Granada, Spain.
⁴ Department of Statistics and Operations Research, University of Granada, Spain.
⁵ Department of Mathematics and Statistics, Faculty of Science, University of Turku, Finland.
⁶ Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
⁷ Department of Clinical Science, University of Bergen, Bergen, Norway.
⁸ APC Microbiome Ireland, University College Cork, T12 YT20 Cork, Ireland.
⁹ School of Microbiology, University College Cork, T12 YT20 Cork, Ireland.
¹⁰ Translational Bioinformatics Unit, Navarrabiomed, Public University of Navarra, IDISNA, Pamplona, Spain.
¹¹ Biological and Environmental Sciences & Engineering Division, King Abdullah University of Science & Technology, Thuwal, Kingdom of Saudi Arabia.
¹² Data Core, Shared Resources, Fred Hutchinson Cancer Center. Seattle, WA. USA.
¹³ Sage Bionetworks, Seattle, WA. USA.
¹⁴ Cambridge Baker Systems Genomics Initiative, Baker Heart & Diabetes Institute, Melbourne, Victoria, Australia.
¹⁵ Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, Cambridge University, Cambridge, UK.
¹⁶ Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK.
¹⁷ Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland.
¹⁸ Department of General Internal Medicine & Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany.
¹⁹ Jacobs School of Engineering, University of California San Diego, La Jolla, CA. USA.
²⁰ Center for Microbiome Innovation, University of California San Diego, La Jolla, CA. USA.
²¹ Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA. USA.
²² Department of Computer Science & Engineering, University of California San Diego, La Jolla, CA. USA.
²³ Division of Medicine, Turku University Hospital, Turku, Finland.
²⁴ Department of Internal Medicine, University of Turku, Turku, Finland.
²⁵ Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, Helsinki, Finland.

PMID: 37873403
PMCID: PMC10593042
DOI: 10.1101/2023.10.12.23296829

Pande Putu Erawijantari et al. medRxiv. 2023.

Abstract

Heart failure (HF) is a major public health problem. Early identification of at-risk individuals could allow for interventions that reduce morbidity or mortality. The community-based FINRISK Microbiome DREAM challenge (synapse.org/finrisk) evaluated the use of machine learning approaches on shotgun metagenomics data obtained from fecal samples to predict incident HF risk over 15 years in a population cohort of 7231 Finnish adults (FINRISK 2002, n=559 incident HF cases). Challenge participants used synthetic data for model training and testing. Final models submitted by seven teams were evaluated on the real data. The two highest-scoring models were both based on Cox regression but used different feature selection approaches. We aggregated their predictions to create an ensemble model. Additionally, we refined the models after the DREAM challenge by eliminating phylum information. Models were also evaluated at intermediate timepoints, and they predicted 10-year incident HF more accurately than models for 5- or 15-year incidence. We found that bacterial species, especially those linked to inflammation, are predictive of incident HF. This highlights the role of the gut microbiome as a potential driver of inflammation in HF pathophysiology. Our results provide insights into potential modeling strategies for microbiome data in prospective cohort studies. Overall, this study provides evidence that incorporating microbiome information into incident risk models can provide important biological insights into the pathogenesis of HF.

Conflict of interest statement

Illumina, Inc., and Janssen Pharmaceutica provided additional support by sponsoring the Center for Microbiome Innovation at the University of California San Diego. T.N. has received honoraria for speaking engagements from Servier and AstraZeneca. V.S. has had research collaboration with Bayer AG, unrelated to this study. J.S.-R. has received funding from GSK, Pfizer and Sanofi, and fees/honoraria from Travere Therapeutics, Stadapharm, Astex, Pfizer and Grunenthal. M.I. is a trustee of the Public Health Genomics (PHG) Foundation, a member of the Scientific Advisory Board of Open Targets, and has a research collaboration with AstraZeneca unrelated to this study. R.K. is a cofounder of Micronoma and Biota, holds stock in Gencirq, Cybele, Biomesense, Micronoma, and Biota, serves as a member of the Scientific Advisory Board of Gencirq, DayTwo, Biomesense, and Micronoma, and serves as a consultant for DayTwo, Cybele, and Biomesense.

Figures

**Figure 1.**
Overview of the DREAM Challenge and FINRISK data. A. Geographical distribution across Finland for the individuals within the national FINRISK 2002 cohort. B. Principal Coordinate Analysis (PCoA) using Bray-Curtis dissimilarity between randomly selected subsets of the data (training, testing, scoring sets). C. The setup and timeline of the DREAM Challenge including submission and scoring phases.
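
As an illustration of the dissimilarity underlying the PCoA in panel B, the following is a minimal sketch of the standard Bray-Curtis formula, sum(|u - v|) / sum(u + v), applied to abundance vectors. The function names and the toy counts are hypothetical and are not taken from the FINRISK data or the challenge code; returning 0 for two empty samples is a convention chosen here.

```python
import numpy as np


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    total = u.sum() + v.sum()
    if total == 0:
        return 0.0  # convention chosen here for two empty samples
    return float(np.abs(u - v).sum() / total)


def bray_curtis_matrix(abundances):
    """Pairwise dissimilarity matrix for a (samples x taxa) array."""
    x = np.asarray(abundances, dtype=float)
    n = x.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = bray_curtis(x[i], x[j])
    return d


# Toy counts for three hypothetical samples and four taxa (not real data).
toy_counts = np.array([
    [10, 0, 5, 5],
    [10, 0, 5, 5],
    [0, 20, 0, 0],
])
toy_dist = bray_curtis_matrix(toy_counts)
```

Identical samples receive a dissimilarity of 0 and samples sharing no taxa receive 1; a matrix of this kind is the usual input to a PCoA ordination.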

**Figure 2.**
Harrell’s C and Hosmer-Lemeshow test. A. Harrell’s C-index and Hosmer-Lemeshow p-value were obtained for the investigated models, including the three baseline models provided by the organizers in the scoring phase. B-C. Harrell’s C-index and Hosmer-Lemeshow empirical p-value on 1000 bootstrapped iterations for all the models. We used blue for SB2, orange for DFH and purple for the baseline models. D. Selected features in the baseline and top models. *Taxonomic features in the “Baseline All” model are presented in Supp. Table 6. **The features and modules selected by the DFH model were weight-based from 10 different seeds. Features present in each model are represented by turquoise-filled squares, while absence is indicated by blank squares.
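
For readers unfamiliar with the discrimination metric reported in this figure, the sketch below gives a plain pairwise implementation of Harrell's C for right-censored data. It is an illustrative simplification, not the challenge's scoring code: the function name and toy values are hypothetical, and pairs with tied follow-up times are skipped.

```python
import numpy as np


def harrell_c(time, event, risk):
    """Harrell's C for right-censored data (simplified pairwise version).

    A pair (i, j) is comparable when subject i had an event and a strictly
    shorter follow-up time than subject j. The pair is concordant when
    subject i also has the higher predicted risk; tied risks count as 0.5.
    Pairs with tied follow-up times are skipped in this sketch.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy follow-up data (years, event indicator) with hypothetical risk scores.
toy_time = [2, 4, 6, 8]
toy_event = [1, 1, 0, 1]
c_perfect = harrell_c(toy_time, toy_event, [0.9, 0.7, 0.5, 0.1])
c_reversed = harrell_c(toy_time, toy_event, [0.1, 0.5, 0.7, 0.9])
c_constant = harrell_c(toy_time, toy_event, [0.3, 0.3, 0.3, 0.3])
```

A value of 1 means that every comparable pair is ranked correctly, 0.5 corresponds to uninformative scores, and 0 to a fully reversed ranking. The quadratic loop is adequate for illustration; an optimized library routine would be preferable at cohort scale.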

**Figure 3.**
Schematic illustration of the modeling workflow of the two top-performing teams. A. Team DFH used a modular Elastic Net regularized Cox proportional hazards model. After manually curating interpretable modules, they identified the optimal features within each module by module-specific cross-validation. The pruned modules were then combined and used to identify the best overall combination of features using cross-validation. The team averaged final risk predictions across multiple seeds. B. The SB2 team used LASSO regularization to retain 29 features. Age, BMI, systolic blood pressure, non-HDL cholesterol, sex, and dysbiosis were included as unpenalized features, whereas blood pressure treatment, prevalent diabetes, smoking, and prevalent coronary heart disease were penalized and selected by LASSO for inclusion in the final Cox proportional hazards model.

**Figure 4.**
A. Harrell’s C-index and B. Hosmer-Lemeshow p-value for the ensemble models obtained by mean aggregation of the final models’ individual risk scores. The lower plot illustrates the combination of teams used to calculate the mean for each aggregated final model. The dashed line corresponds to p-value=0.05 on the y-axis (B), while the x-axis represents different combinations of ensemble models.
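
The mean aggregation over team combinations described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the third team label ("TeamX"), and the scores are hypothetical, and the sketch assumes that all teams' risk scores are on a comparable scale and listed in the same individual order.

```python
from itertools import combinations

import numpy as np


def mean_ensembles(scores_by_team, min_size=2):
    """Average risk scores over every combination of at least `min_size` teams.

    `scores_by_team` maps a team label to a 1-D sequence of per-individual
    risk scores, all given in the same individual order.
    Returns a dict keyed by the tuple of team labels in each combination.
    """
    teams = sorted(scores_by_team)
    ensembles = {}
    for k in range(min_size, len(teams) + 1):
        for combo in combinations(teams, k):
            stacked = np.vstack(
                [np.asarray(scores_by_team[t], dtype=float) for t in combo]
            )
            ensembles[combo] = stacked.mean(axis=0)  # mean per individual
    return ensembles


# Hypothetical risk scores for three individuals (not challenge output).
toy_scores = {
    "SB2": [0.2, 0.8, 0.4],
    "DFH": [0.4, 0.6, 0.0],
    "TeamX": [0.6, 0.1, 0.2],
}
toy_ensembles = mean_ensembles(toy_scores)
```

Each resulting score vector can then be evaluated with the same discrimination and calibration metrics as the individual models, giving one point per team combination.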

**Figure 5.**
Evaluation of model performance over varying follow-up times. A. Harrell’s C-index and B. Hosmer-Lemeshow p-values are presented for different models, distinguished by unique colors, across three distinct follow-up periods: 5, 10, and 15 years. The two HF definitions are represented by different shapes.

Grants and funding

T32 GM007635/GM/NIGMS NIH HHS/United States
