Cancer Cell Int. 2023 Apr 6;23(1):61.
doi: 10.1186/s12935-023-02905-x.

Molecular subtypes predict therapeutic responses and identifying and validating diagnostic signatures based on machine learning in chronic myeloid leukemia

Fang-Min Zhong et al. Cancer Cell Int.

Abstract

Chronic myeloid leukemia (CML) is a hematological tumor derived from hematopoietic stem cells. The aim of this study is to analyze the biological characteristics of CML and identify its diagnostic markers. We obtained expression profiles from the Gene Expression Omnibus (GEO) database and identified 210 differentially expressed genes (DEGs) between CML and normal samples. These DEGs are mainly enriched in immune-related pathways such as Th1 and Th2 cell differentiation, primary immunodeficiency, T cell receptor signaling, and antigen processing and presentation. Based on these DEGs, we identified two molecular subtypes using a consensus clustering algorithm. Cluster A was an immunosuppressive phenotype with reduced immune cell infiltration and significant activation of metabolism-related pathways such as reactive oxygen species, glycolysis and mTORC1; Cluster B was an immune-activating phenotype with increased infiltration of CD4+ and CD8+ T cells and NK cells, and increased activation of signaling pathways such as interferon gamma (IFN-γ) response, IL6-JAK-STAT3 and inflammatory response. Drug prediction results showed that patients in Cluster B had a higher therapeutic response to anti-PD-1 and anti-CTLA4 and were more sensitive to imatinib, nilotinib and dasatinib. Support Vector Machine Recursive Feature Elimination (SVM-RFE), Least Absolute Shrinkage and Selection Operator (LASSO) and Random Forest (RF) algorithms identified 4 CML diagnostic genes (HDC, SMPDL3A, IRF4 and AQP3), and the risk score model constructed from these genes improved the diagnostic accuracy. We further validated the diagnostic value of the 4 genes and the risk score model in a clinical cohort, and the risk score can be used in the differential diagnosis between CML and other hematological malignancies. The risk score can also be used to identify molecular subtypes and predict response to imatinib treatment.
These results reveal the characteristics of immunosuppression and metabolic reprogramming in CML patients, and the identification of molecular subtypes and biomarkers provides new ideas and insights for clinical diagnosis and treatment.
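
As an illustration of how a gene-based risk score of this kind could be computed and evaluated, the following minimal Python sketch treats the score as a weighted sum of the four genes' expression values and measures diagnostic discrimination with the area under the ROC curve. The weighted-sum form is an assumption suggested by the coefficients shown in Fig. 4I; the coefficient values, their signs, and the function names below are hypothetical placeholders, not the authors' fitted model.

```python
from typing import Dict, Sequence

# Hypothetical placeholder weights (values and signs are arbitrary).
# The abstract names the four genes but does not report their coefficients.
COEFFICIENTS: Dict[str, float] = {
    "HDC": 1.0,
    "SMPDL3A": 0.5,
    "IRF4": -0.8,
    "AQP3": -0.6,
}


def risk_score(expression: Dict[str, float],
               coefficients: Dict[str, float]) -> float:
    """Weighted sum of expression values over the genes in the model."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive sample (label 1) scores
    higher than a negative sample (label 0); ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Given one expression dictionary per sample, `risk_score` yields a single number per sample, and `roc_auc` applied to those scores with CML-versus-normal labels gives the kind of summary plotted in the ROC analyses of Fig. 5. With real data the weights would come from a fitted model rather than being set by hand.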

Keywords: Chronic myeloid leukemia; Diagnosis; Machine learning; Therapeutic response; Tumor microenvironment.

Conflict of interest statement

The authors have no competing interests to disclose.

Figures

Fig. 1
Identification of differentially expressed genes (DEGs) between CML and normal samples. A, B Volcano plots of datasets GSE13159 (A) and GSE144119 (B) from the GEO database; blue dots represent up-regulated DEGs, gray dots represent nonsignificant genes, and red dots represent down-regulated DEGs. The top 10 high- and low-expressing DEGs with the smallest adjusted P-values are listed. C The heatmap shows DEGs with a common expression trend in both cohorts. D KEGG enrichment analysis of DEGs. E GO annotation analysis of DEGs
Fig. 2
Analysis of immune cell infiltration and prediction of upstream regulatory network according to the DEGs. A GSEA analysis of enrichment pathways in the CML group. B GSEA analysis of enrichment pathways in the Normal group. C Differences in infiltration of 22 immune cells between CML and normal samples. D Differences in expression of immune checkpoints between CML and normal samples. E PPI network of the DEGs. F The subnetwork of the top 20 most connected genes in the PPI network. G Kinases and H transcription factors according to the predictions of the DEGs. I Regulatory network diagram according to the prediction of the DEGs. Node size is scaled in proportion to the corresponding degree
Fig. 3
Identification of molecular subtypes of CML and prediction of drug response in different subtypes. A The consensus clustering algorithm divided CML patients into two different molecular subtypes based on the expression of DEGs. B PCA algorithm was used to verify the classification reliability of the two molecular subtypes. C–F Differences in expression of DEGs (C), infiltration of 22 immune cells (D), activity of tumor hallmark gene sets (E) and TIDE scores (F) between the two molecular subtypes. G Differences in the therapeutic response of the two molecular subtypes to immune checkpoint inhibitors. H–K Differences in therapeutic sensitivity of the two molecular subtypes to four TKIs
Fig. 4
Screening diagnostic markers for CML. A, B Diagnostic markers were screened by the LASSO regression algorithm. C, D Diagnostic markers were screened by the RF algorithm. E Diagnostic markers were screened by the SVM-RFE algorithm. F Venn diagram of variables screened by LASSO, RF and SVM-RFE algorithms. G Differences in expression of the four diagnostic genes in the GSE13159 cohort. H Differences in expression of the four diagnostic genes in the GSE144119 cohort. I Coefficients for the four genes in the risk score model. J Distribution of risk scores in the GSE13159 cohort. K Distribution of risk scores in the GSE144119 cohort
Fig. 5
Analysis of the diagnostic value of diagnostic markers. A, B ROC curve analysis of the diagnostic value of the four diagnostic genes and the risk score in the GSE13159 (A) and GSE144119 (B) cohorts. C, D Differences in risk scores between different molecular subtypes in the GSE13159 (C) and GSE144119 (D) cohorts. E Differences in risk scores between patients who responded and did not respond to imatinib treatment in the GSE2535 cohort
Fig. 6
Correlation analysis of diagnostic biomarkers with biological characteristics. A Correlation analysis of four diagnostic genes, risk score and immune cells. B Correlation analysis of four diagnostic genes, risk score, and cancer-related signaling pathways. C Regulatory network of miRNAs and four diagnostic genes; red indicates miRNA expression is up-regulated in CML samples, green indicates expression is down-regulated, and blue indicates that expression data for the relevant miRNAs were not obtained. D The interaction network indicates drugs that may have regulatory relationships with HDC
Fig. 7
Clinical independent cohort validated diagnostic markers of CML. A–C Our sequencing cohort validates the differences in four diagnostic genes and risk scores between CML and normal samples and the diagnostic value of risk score in CML. D–F Our clinical PCR cohort validated the differences in four diagnostic genes and risk scores between CML samples and normal samples, as well as the diagnostic value
Fig. 8
The value of diagnostic markers in the differential diagnosis between CML and other hematological malignancies. A PCA plot shows clustering features of CML, AML, CLL, ALL, MDS and normal samples based on expression of the four diagnostic genes. B Differential expression of four diagnostic genes in CML, AML, CLL, ALL, MDS and normal samples. C Differences in the distribution of risk scores among CML, AML, CLL, ALL, MDS and normal samples. D ROC curve analysis of the differential diagnostic value of the risk score in CML versus other hematological malignancies
