Consensus clustering applied to multi-omics disease subtyping

BMC Bioinformatics. 2021 Jul 6;22(1):361. doi: 10.1186/s12859-021-04279-1.

Galadriel Brière¹,², Élodie Darbo³,⁴, Patricia Thébault^#³, Raluca Uricaru^#³

Affiliations

¹ CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux, 33400, Talence, France. marie-galadriel.briere@u-bordeaux.fr.
² INRA, Bordeaux INP, NutriNeuro, UMR 1286, Univ. Bordeaux, 33000, Bordeaux, France. marie-galadriel.briere@u-bordeaux.fr.
³ CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux, 33400, Talence, France.
⁴ INSERM U1218, Institut Bergonié, Univ. Bordeaux, 33076, Bordeaux, France.

^# Contributed equally.

PMID: 34229612
PMCID: PMC8259015
DOI: 10.1186/s12859-021-04279-1

Galadriel Brière et al. BMC Bioinformatics. 2021.

Abstract

Background: Given the diversity of omics data and the difficulty of selecting one result among those produced by several methods, consensus strategies have the potential to reconcile multiple inputs and produce robust results.

Results: Here, we introduce ClustOmics, a generic consensus clustering tool that we use in the context of cancer subtyping. ClustOmics relies on a non-relational graph database, which allows for the simultaneous integration of both multiple omics data and results from various clustering methods. This new tool reconciles input clusterings, regardless of their origin, number, size or shape. ClustOmics implements an intuitive and flexible strategy based on the idea of evidence accumulation clustering: it computes co-occurrences of pairs of samples in input clusters and uses this score as a similarity measure to reorganize data into consensus clusters.
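The co-occurrence score described above can be illustrated with a minimal Python sketch. It is not taken from the ClustOmics source code: the function name `co_occurrence_counts`, the dictionary-based input format and the toy patient identifiers are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(clusterings):
    """Count, for every pair of samples, how many input clusterings
    place both samples in the same cluster.

    `clusterings` is a list of dicts mapping sample id -> cluster label
    (a hypothetical input format). A sample missing from a clustering
    contributes no support for that clustering.
    """
    support = Counter()
    for labels in clusterings:
        # Group the samples of this clustering by cluster label.
        members = {}
        for sample, cluster in labels.items():
            members.setdefault(cluster, []).append(sample)
        # Every pair inside one cluster gains one unit of support.
        for group in members.values():
            for a, b in combinations(sorted(group), 2):
                support[(a, b)] += 1
    return support


# Hypothetical toy input: three clusterings of four patients.
toy_clusterings = [
    {"p1": 0, "p2": 0, "p3": 1, "p4": 1},
    {"p1": "a", "p2": "a", "p3": "a", "p4": "b"},
    {"p1": 1, "p2": 1, "p3": 2},  # p4 was not clustered by this method
]
toy_support = co_occurrence_counts(toy_clusterings)
```

In this toy input, the pair (p1, p2) is grouped together by all three clusterings and so receives a support of 3, whereas (p3, p4) is grouped together only once. Cluster labels need not be comparable across clusterings, since only co-membership within each clustering is counted.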

Conclusion: We applied ClustOmics to multi-omics disease subtyping on real TCGA cancer data from ten different cancer types. We showed that ClustOmics is robust to heterogeneous qualities of input partitions, smoothing and reconciling preliminary predictions into high-quality consensus clusters, from both a computational and a biological point of view. The comparison to a state-of-the-art consensus-based integration tool, COCA, further corroborated this statement. However, the main interest of ClustOmics is not to compete with other tools, but rather to benefit from their various predictions when no gold-standard metric is available to assess their significance.

Availability: The ClustOmics source code, released under the MIT license, and the results obtained on TCGA cancer data are available on GitHub: https://github.com/galadrielbriere/ClustOmics

Keywords: Consensus clustering; Data integration; Disease subtyping; Multi-omic data.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

**Fig. 1**
Two integration scenarios: multi-to-multi consensus clustering and single-to-multi consensus clustering. Arrows are dashed according to the omics considered by each input clustering method

**Fig. 2**
Overview of survival and clinical label enrichment results for the ten cancer types analyzed. The x-axis represents the number of significant survival p values ($< 0.01$) found for each clustering, across all ten cancer types. The y-axis represents the total number of significantly enriched clinical labels (p values $< 0.01$), all cancer types included. In total, 79 enrichment p values were computed from 32 distinct clinical labels

**Fig. 3**
Survival analysis results for ClustOmics and COCA multi-to-multi consensus clustering and for each input multi-omics clustering. The horizontal dashed line indicates the threshold for significantly different survival rate (p value $\leq 0.01$). Boxplots were computed considering input clusterings only

**Fig. 4**
Adjusted Rand index of input clusterings relative to ClustOmics and COCA StoM consensus multi-omics clusterings. Each point corresponds to one clustering and is colored according to the omics type used. Each omics dataset was clustered using five different clustering tools (PINS, NEMO, SNF, rMKL, K-means) and is therefore represented by five input clusterings. The similarity of the ClustOmics and COCA consensus clusterings is shown with a black square and a brown diamond, respectively

**Fig. 5**
Survival analysis results for ClustOmics and COCA single-to-multi All consensus clustering and for each input multi-omics clustering. The horizontal dashed line indicates the threshold for significantly different survival rate (p value $\leq 0.01$). Boxplots were computed considering input clusterings only

**Fig. 6**
BIC consensus clustering with patients colored according to the PAM50 prediction. Annotated screenshot from the Neo4j browser for graph visualization

**Fig. 7**
An overview of the strategy implemented in ClustOmics

**Fig. 8**
An integration graph filtered with increasing threshold values: 1, 3, and 5 (the maximum number of supports for an integration edge being 5 in this example). Screenshots from the Neo4j browser for graph visualization
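
The threshold filtering shown in Fig. 8 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`filter_edges`, `connected_components`, `clusters_at`) and the toy support values are hypothetical, and connected components are used here as a simple stand-in for the step that turns the filtered graph into consensus clusters, which is not necessarily what ClustOmics does.

```python
def filter_edges(support, threshold):
    """Keep only the sample pairs supported by at least `threshold` clusterings."""
    return {pair: n for pair, n in support.items() if n >= threshold}


def connected_components(nodes, edges):
    """Group nodes linked by the kept edges, using a small union-find.

    Assumption: connected components stand in for the graph clustering step.
    """
    parent = {node: node for node in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


def clusters_at(support, nodes, threshold):
    """Filter the integration edges, then read off the remaining groups."""
    return connected_components(nodes, filter_edges(support, threshold))


# Hypothetical integration edges with a maximum support of 5, as in Fig. 8.
toy_nodes = ["p1", "p2", "p3", "p4", "p5"]
toy_edges = {
    ("p1", "p2"): 5,
    ("p2", "p3"): 3,
    ("p3", "p4"): 1,
    ("p4", "p5"): 5,
}

for threshold in (1, 3, 5):
    print(threshold, clusters_at(toy_edges, toy_nodes, threshold))
```

On this toy graph, a threshold of 1 keeps every edge and yields a single group, a threshold of 3 splits the samples into two groups, and a threshold of 5 leaves p3 isolated. Raising the threshold therefore trades coverage of samples for stronger agreement among the input clusterings.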

Cited by

Artificial intelligence in ovarian cancer drug resistance advanced 3PM approach: subtype classification and prognostic modeling.
Zhang C, Yang J, Chen S, Sun L, Li K, Lai G, Peng B, Zhong X, Xie B. EPMA J. 2024 Jul 13;15(3):525-544. doi: 10.1007/s13167-024-00374-4. eCollection 2024 Sep. PMID: 39239109
Identification and validation of a prognostic signature of cuproptosis-related genes for esophageal squamous cell carcinoma.
Zhang Y, Chen K, Wang L, Chen J, Lin Z, Chen Y, Chen J, Lin Y, Xu Y, Peng H. Aging (Albany NY). 2023 Sep 2;15(17):8993-9021. doi: 10.18632/aging.205012. Epub 2023 Sep 2. PMID: 37665670
Integrated bioinformatic analysis of mitochondrial metabolism-related genes in acute myeloid leukemia.
Tong X, Zhou F. Front Immunol. 2023 Apr 17;14:1120670. doi: 10.3389/fimmu.2023.1120670. eCollection 2023. PMID: 37138869
A Novel Neutrophil Extracellular Trap Signature Predicts Patient Chemotherapy Resistance and Prognosis in Lung Adenocarcinoma.
Xing L, Wu S, Xue S, Li X. Mol Biotechnol. 2025 May;67(5):1939-1957. doi: 10.1007/s12033-024-01170-1. Epub 2024 May 11. PMID: 38734842
Three decades of advancements in osteoarthritis research: insights from transcriptomic, proteomic, and metabolomic studies.
Rai MF, Collins KH, Lang A, Maerz T, Geurts J, Ruiz-Romero C, June RK, Ramos Y, Rice SJ, Ali SA, Pastrello C, Jurisica I, Thomas Appleton C, Rockel JS, Kapoor M. Osteoarthritis Cartilage. 2024 Apr;32(4):385-397. doi: 10.1016/j.joca.2023.11.019. Epub 2023 Dec 2. PMID: 38049029. Review.
