Comput Struct Biotechnol J. 2024 Jun 29;23:2798-2810.
doi: 10.1016/j.csbj.2024.06.035. eCollection 2024 Dec.

Mime: A flexible machine-learning framework to construct and visualize models for clinical characteristics prediction and feature selection

Hongwei Liu et al. Comput Struct Biotechnol J. 2024.

Abstract

The widespread use of high-throughput sequencing technologies has revolutionized the understanding of biology and cancer heterogeneity. Recently, several machine-learning models based on transcriptional data have been developed to accurately predict patient outcomes and clinical responses. However, an open-source R package that covers state-of-the-art machine-learning algorithms and offers user-friendly access has yet to be developed. We therefore propose Mime, a flexible computational framework for constructing a machine learning-based integration model with elegant performance. Mime streamlines the development of highly accurate predictive models, leveraging complex datasets to identify critical genes associated with prognosis. An in silico combined model based on de novo PIEZO1-associated signatures, constructed with Mime, predicted patient outcomes with high accuracy compared with other published models. Furthermore, the PIEZO1-associated signatures could also precisely infer immunotherapy response when different algorithms in Mime were applied. Finally, SDC1, selected from the PIEZO1-associated signatures, showed high potential as a glioma target. Taken together, our package provides a user-friendly solution for constructing machine learning-based integration models and will be further expanded to provide valuable insights into current fields. The Mime package is available on GitHub (https://github.com/l-magnificence/Mime).

Keywords: GitHub; Machine learning; Mime; PIEZO1; Prediction models; R package.

Conflict of interest statement

All authors declared that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Graphical abstract
Fig. 1
A schematic diagram of Mime. Mime streamlined the process of developing models for accurately predicting outcomes and therapeutic responses of patients, leveraging complex datasets to identify critical genes associated with prognosis.
Fig. 2
Construction of prognostic models based on the PIEZO1-associated signature. A. Workflow for constructing prognostic models based on the PIEZO1-associated signature. B. DEGs identified between the control and PIEZO1 knockdown conditions in G508 and G532. Top histogram: number of DEGs intersected across multiple conditions. C. C-index of each model in different cohorts, sorted by the average C-index in the validation cohorts. D. Relationship between the risk score calculated by the StepCox[forward]-Ridge combined model and patient outcome in different cohorts.
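The C-index used to rank models in Fig. 2C can be illustrated with a minimal sketch of Harrell's concordance index. This is not Mime's implementation (Mime is an R package); the function name `concordance_index`, its arguments, and the choice to skip pairs with tied survival times are assumptions made for illustration only.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output, higher score = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # 'a' is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk, earlier event: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means the risk score orders every comparable pair of patients correctly, 0.5 corresponds to random ordering, and values below 0.5 indicate an inverted score.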
Fig. 3
Performance of prognostic models. A. 1-year, 3-year and 5-year AUC of the top 15 models in different cohorts, sorted by the average AUC in the validation cohorts. Numbers in black font indicate that the risk score calculated by the model predicted a better outcome in the corresponding cohort; other numbers indicate a worse outcome prediction. B. 1-year, 3-year and 5-year AUC of the StepCox[forward]-Ridge combined model in different cohorts. C. Meta-analysis of the univariate Cox results of the StepCox[forward]-Ridge combined model across different cohorts. D. Multivariate Cox results of the StepCox[forward]-Ridge combined model in four independent cohorts.
Fig. 4
Comparison with previously established models in glioma. A. HR of StepCox[forward]-Ridge combined model and 95 published models across 9 cohorts. B. C-index of StepCox[forward]-Ridge combined model and 95 published models across 9 cohorts. C. 1-year AUC predicted by StepCox[forward]-Ridge combined model and 95 published models across 9 cohorts.
Fig. 5
Correlation between risk score and immune or genome signatures. A. Relationship between the risk score calculated by the StepCox[forward]-Ridge combined model and microenvironment signatures deconvoluted by different methods in the TCGA glioma cohort. The IPS method was from the IOBR package, while the other methods were from the immunedeconv package. B. Same as A but in the CGGA.693 cohort. C. Correlation between risk score and various immune genes. D-F. Correlation between risk score and loss of heterozygosity segment number (D), loss of heterozygosity alteration fraction (E), CNA alteration fraction, homologous recombination deficiency score, non-silent mutations, and aneuploidy score (F), respectively. * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001.
Fig. 6
Construction of predictive models for immunotherapy benefits. A. Workflow for constructing predictive models for immunotherapy benefits based on the PIEZO1-associated signature. B. ROC curves of each model for predicting the benefits of immunotherapy in the training and validation datasets. Bottom right histogram: distribution of the AUC achieved by the 7 machine-learning models in the training and validation datasets. C. AUC of the AdaBoost model and 13 published models across the training and validation datasets.
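The AUC values reported for the response models in Fig. 6 summarize a ROC curve in a single number. As a rough sketch, the AUC equals the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder. The function name `roc_auc` and the 0/1 label encoding below are assumptions for illustration; this is not code from Mime.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative sketch).

    labels -- 1 for responders (positives), 0 for non-responders (negatives)
    scores -- model output, higher score = more likely to respond
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a demonstration; rank-based formulations give the same value more efficiently on large cohorts.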
Fig. 7
Characteristics of SDC1 in glioma. A. Workflow for identifying critical genes based on the PIEZO1-associated signature. B. Prognosis-associated genes selected by different machine-learning algorithms. Top histogram: number of genes intersected by multiple models. Top right chart: frequency of genes selected by different models. C. Expression level of SDC1 in the control and PIEZO1 knockdown conditions in G508 and G532. Statistical test: t-test. D. Pearson correlation between PIEZO1 and SDC1 in different cohorts. E. Relationship between SDC1 expression and other clinical features (grade, histology, IDH mutation, 1p/19q status and transcriptional subtypes) in the TCGA, CGGA.325, CGGA.693 and GSE16011 cohorts. Statistical test: chi-square test. F. GO enrichment of SDC1-regulated genes. A normalized enrichment score (NES) > 0 indicates processes up-regulated in the SDC1 high-expression group; otherwise, the processes are down-regulated in that group.
Fig. 8
A basic exploration within the Shiny application of Mime. Users can obtain visualization results interactively by uploading local Mime outputs.
