A machine learning approach in a monocentric cohort for predicting primary refractory disease in Diffuse Large B-cell lymphoma patients

Affiliations

¹ Department of Technology and Information Systems, Grand Hôpital de Charleroi, Charleroi, Belgium.
² Department of Clinical Research, Grand Hôpital de Charleroi, Charleroi, Belgium.
³ Department of Medico-Economic Information, Grand Hôpital de Charleroi, Charleroi, Belgium.
⁴ School of Public Health, Université Libre de Bruxelles (U.L.B.), Brussels, Belgium.
⁵ Division of Hematology, Hematology and oncology Department, Grand Hôpital de Charleroi, Charleroi, Belgium.

PMID: 39352921
PMCID: PMC11444388
DOI: 10.1371/journal.pone.0311261

Marie Y Detrait et al. PLoS One. 2024 Oct 1;19(10):e0311261. doi: 10.1371/journal.pone.0311261. eCollection 2024.

Abstract

Introduction: Primary refractory disease affects 30-40% of patients diagnosed with diffuse large B-cell lymphoma (DLBCL) and is a major challenge in disease management because of its poor prognosis. Predicting refractory status could inform treatment strategy and enable early intervention, since various treatment options are now available depending on patient and disease characteristics. Supervised machine-learning techniques, which can predict outcomes in a medical context, appear well suited to this purpose.

Design: Retrospective monocentric cohort study.

Patient population: Adult patients with a first diagnosis of DLBCL admitted to the hematology unit from 2017 to 2022.

Aim: We evaluated five supervised machine-learning (ML) models at our center as tools for predicting primary refractory DLBCL.

Main results: One hundred and thirty patients with diffuse large B-cell lymphoma (DLBCL) were included between January 2017 and December 2022. The variables used for analysis included demographic characteristics, clinical condition, disease characteristics, first-line therapy, and whether a PET-CT scan was performed after 2 cycles of treatment. We compared five supervised ML models for predicting primary refractory disease: support vector machine (SVM), Random Forest Classifier (RFC), Logistic Regression (LR), Naïve Bayes (NB) Categorical classifier, and eXtreme Gradient Boosting (XGBoost). Model performance was evaluated using the area under the receiver operating characteristic curve (ROC-AUC), accuracy, false positive rate, sensitivity, and F1-score to identify the best model.

After a median follow-up of 19.5 months, the overall survival rate in the cohort was 60%. By the Kaplan-Meier method, overall survival at 3 years was 58.5% (95% CI, 51-68.5) and 3-year progression-free survival was 63% (95% CI, 54-71). Of the 124 patients who received first-line treatment, 42 (33.8%) had primary refractory disease and 2 (1.6%) relapsed within 6 months. In univariate analysis, the factors associated with refractory disease status were age (p = 0.009), Ann Arbor stage (p = 0.013), CMV infection (p = 0.012), comorbidity (p = 0.019), IPI score (p < 0.001), first line of treatment (p < 0.001), EBV infection (p = 0.008), and socioeconomic status (p = 0.02).

The NB Categorical classifier was the top-performing model, with a ROC-AUC of 0.81 (95% CI, 0.64-0.96), an accuracy of 83%, an F1-score of 0.82, and a low false positive rate of 10% on the validation set. The XGBoost model and the RFC followed, with ROC-AUCs of 0.74 (95% CI, 0.52-0.93) and 0.67 (95% CI, 0.46-0.88), accuracies of 78% and 72%, and F1-scores of 0.75 and 0.67, respectively, and a false positive rate of 10% for both. SVM and LR performed worse, with ROC-AUCs of 0.65 (95% CI, 0.40-0.87) and 0.45 (95% CI, 0.29-0.64), accuracies of 67% and 50%, F1-scores of 0.64 and 0.43, and false positive rates of 28% and 37%, respectively.
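
For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how they can be computed for a binary classifier, using only the Python standard library. It is not the authors' code (their implementation is in the repository cited in Fig 3). The function names and the toy labels and scores are hypothetical, and the F1-score shown is the plain binary F1 for the refractory class, which may differ from the averaging used in the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, false positive rate and binary F1 (1 = refractory)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a positive case outscores a negative one.

    Ties count as one half (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1, 0.05, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The false positive rate here is the share of non-refractory patients wrongly flagged as refractory, which is why a low value matters if the prediction is meant to trigger early treatment changes.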

Conclusion: Machine learning algorithms, particularly the NB Categorical classifier, have the potential to improve the prediction of primary refractory disease in DLBCL patients, thereby providing a novel decision-making tool for managing this condition. Multicenter studies in larger cohorts are needed to validate these results.
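
To illustrate the kind of model highlighted in the conclusion, here is a minimal sketch of a categorical Naïve Bayes classifier with Laplace smoothing, written with the Python standard library only. It is an illustration of the general technique, not the authors' implementation. The class name, the two features and every data row are hypothetical and were invented for the example.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is classic Laplace smoothing

    def fit(self, rows, labels):
        self.classes_ = sorted(set(labels))
        self.class_counts_ = Counter(labels)
        n_features = len(rows[0])
        # Number of distinct categories seen for each feature.
        self.n_categories_ = [len({row[j] for row in rows}) for j in range(n_features)]
        # counts_[c][j][value] = number of class-c rows with that value in feature j.
        self.counts_ = {c: [Counter() for _ in range(n_features)] for c in self.classes_}
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts_[label][j][value] += 1
        return self

    def _log_joint(self, row, c):
        total = sum(self.class_counts_.values())
        log_p = math.log(self.class_counts_[c] / total)  # log prior
        for j, value in enumerate(row):
            num = self.counts_[c][j][value] + self.alpha
            den = self.class_counts_[c] + self.alpha * self.n_categories_[j]
            log_p += math.log(num / den)  # smoothed log likelihood
        return log_p

    def predict_proba(self, row):
        logs = {c: self._log_joint(row, c) for c in self.classes_}
        top = max(logs.values())
        exps = {c: math.exp(v - top) for c, v in logs.items()}  # stable normalization
        norm = sum(exps.values())
        return {c: e / norm for c, e in exps.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Hypothetical toy data: (IPI risk group, Ann Arbor stage group) -> 1 = refractory.
rows = [("high", "III-IV"), ("high", "III-IV"), ("high", "I-II"), ("low", "III-IV"),
        ("low", "I-II"), ("low", "I-II"), ("low", "I-II"), ("high", "I-II")]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict_proba(("high", "III-IV")))
```

The model multiplies a class prior by one smoothed per-feature probability per variable, so it has few parameters to estimate, which may be one reason this family of models can hold up in small cohorts.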

Copyright: © 2024 Detrait et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Overall survival using Kaplan-Meier method.**

**Fig 2. Progression-free survival using Kaplan-Meier method.**

**Fig 3. The ROC-AUC curves for each model.**
Best hyperparameter settings for the algorithms tuned with GridSearchCV: RF: max_depth: 5, n_estimators: 50, max_samples: 0.75, min_samples_split: 10, and min_samples_leaf: 2; XGBoost: colsample_bytree: 0.8, learning_rate: 0.2, max_depth: 5, n_estimators: 50, and subsample: 0.8; and NuSVC: gamma: 0.01 and kernel: rbf. All details are provided in the repository at https://github.com/MarieDetrait-MD/PrimaryRefractoryDLBCL.
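
As a rough illustration of what an exhaustive grid search such as GridSearchCV does, the sketch below enumerates every combination in a parameter grid and keeps the best-scoring one, using only the Python standard library. The helper names, the candidate values in the grid and the scoring function are hypothetical. Only the selected Random Forest values come from the caption above, and in practice the score would be a cross-validated metric such as mean ROC-AUC rather than the placeholder used here.

```python
from itertools import product


def iter_grid(param_grid):
    """Yield one dict per combination of the candidate values in param_grid."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(param_grid, score_fn):
    """Return the (params, score) pair with the highest score; first one wins ties."""
    best_params, best_score = None, float("-inf")
    for params in iter_grid(param_grid):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values; the grid actually searched is not given in the caption.
rf_grid = {
    "max_depth": [3, 5, 10],
    "n_estimators": [50, 100],
    "max_samples": [0.5, 0.75],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 2],
}

# Random Forest settings reported as best in the caption.
reported_rf = {"max_depth": 5, "n_estimators": 50, "max_samples": 0.75,
               "min_samples_split": 10, "min_samples_leaf": 2}


def placeholder_score(params):
    """Stand-in for a cross-validated score: counts agreement with reported_rf."""
    return sum(params[name] == value for name, value in reported_rf.items())


best, score = grid_search(rf_grid, placeholder_score)
print(len(list(iter_grid(rf_grid))), best, score)
```

The number of model fits grows multiplicatively with each added hyperparameter, which is worth keeping in mind when tuning on a cohort of this size with cross-validation.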
