Cancers (Basel). 2025 Sep 9;17(18):2952.
doi: 10.3390/cancers17182952.

Survival Machine Learning Methods Improve Prediction of Histologic Transformation in Follicular and Marginal Zone Lymphomas

Tong-Yoon Kim et al. Cancers (Basel).

Abstract

Background/objectives: Follicular lymphoma (FL) and marginal zone lymphoma (MZL) are low-grade B-cell lymphomas (LGBCLs) with indolent clinical courses but a lifelong risk of histologic transformation (HT) to aggressive lymphomas, particularly diffuse large B-cell lymphoma. Predicting HT is challenging because of class imbalance and the inherent complexity of time-dependent events. Current prognostic indices address survival but do not specifically address HT risk. This study aimed to develop and validate survival-based and traditional classification machine-learning models to predict HT in FL and MZL cohorts.

Methods: Using a multicenter retrospective dataset (n = 1068), survival models (Cox proportional hazards, Lasso-Cox, Random Survival Forest, Gradient-boosted Cox [GBM-Cox], eXtreme Gradient Boosting [XGBoost]-Cox) and classification models (Logistic regression, Lasso logistic, Random Forest, Gradient Boosting, XGBoost) were compared. The best-performing survival models (XGBoost-Cox, Lasso-Cox, and GBM-Cox) were assessed on an independent test set (n = 92). Model sensitivity was maximized using optimal binary risk cutoff points based on Youden's index.
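
For readers unfamiliar with Youden's index, the sketch below shows one common way to pick a binary risk cutoff from continuous risk scores: scan the candidate thresholds and keep the one that maximizes J = sensitivity + specificity - 1. This is an illustration of the general definition only, not the authors' code. The function name `youden_cutoff`, the convention that a score at or above the cutoff counts as high-risk, and the toy scores and labels are all assumptions made for the example.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    Assumed convention: a patient is called high-risk when score >= cutoff.
    labels are 1 for an event (e.g. transformation) and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one event and one non-event")

    best_cutoff, best_j = None, -1.0
    # Every distinct observed score is a candidate cutoff.
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # strict '>' keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data (hypothetical, not from the study).
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
print(cutoff, j)
```

In this toy example the selected cutoff captures all three events at the cost of one false positive. In a survival setting the labels would have to be defined at a fixed landmark time, a step this sketch leaves out.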

Results: Survival models showed better predictive performance than classical classifiers, with XGBoost-Cox exhibiting the highest mean accuracy (85.3%), time-dependent area under the curve (0.795), sensitivity (98%), specificity (83.9%), and concordance index (0.836). Incorporating next-generation sequencing (NGS) data improved model accuracy and specificity, indicating that genetic factors improve HT prediction. Principal component analysis revealed distinct gene mutation patterns associated with HT risk, highlighting DNA-repair genes such as TP53, BLM, and RAD50.
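
The concordance index reported above measures how often a model ranks patients correctly: among comparable pairs, the patient with the earlier observed event should have the higher predicted risk. The sketch below implements Harrell's version of the statistic in plain Python as an illustration. It is not the authors' implementation, the abstract does not say which estimator was used, and the function name and toy data are assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when patient i had an observed event and
    time_i < time_j. It is concordant when risk_i > risk_j; tied risks
    count as half. Pairs with tied times are ignored in this simple version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in months, 1 = event observed, predicted risk.
times = [5, 8, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risks = [0.9, 0.4, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
print(round(c_index, 3))
```

Here six of the seven comparable pairs are ranked correctly, giving a C-index of 6/7. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The time-dependent AUC reported alongside it is a different statistic, evaluated at fixed landmark times.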

Conclusions: This study highlights the clinical value of survival-based machine-learning methods integrated with NGS data to personalize HT risk stratification for patients with FL and MZL.

Keywords: aggressive histologic transformation; follicular lymphoma; low-grade B-cell lymphoma; marginal zone lymphoma; survival machine learning.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Overall survival and cumulative incidence of histologic transformation (HT) in the training and validation cohorts. (A) Kaplan–Meier curves for overall survival (OS) in the training set (n = 592), stratified by HT status. (B) OS curves in the validation/test set (n = 92), comparing patients with and without HT. (C) Cumulative incidence (CI) of HT in the training set, accounting for death as a competing risk. (D) CI of HT in the validation/test set, with death as a competing risk.
Figure 2
Time-dependent AUC in the training/validation cohorts and performance on the test set. (A) Time-dependent AUC for five landmark times (6, 12, 18, 24, and 36 months), comparing survival-based and classification-based models in the training and validation cohorts. Blue lines denote survival-based models; orange lines denote classification-based models. (B) Test set performance (n = 92) over the same timepoints for the three top survival models (XGBoost-Cox, Lasso-Cox, and GBM-Cox), showing accuracy, time-dependent AUC, sensitivity, and specificity. (C) Concordance indices (C-index) for GBM-Cox, Lasso-Cox, and XGBoost-Cox on the test set.
Figure 3
Test-set evaluation of XGBoost-Cox models and NGS predictor patterns. (A) Time-dependent performance of the XGBoost-Cox model with and without NGS at 6, 12, 18, 24, and 36 months for four metrics: accuracy, time-dependent AUC, sensitivity, and specificity. (B) Bar plots showing the prevalence of the top 15 NGS-derived predictor variables in patients who developed HT (blue) and those who did not (orange) in the test cohort (n = 92). (C) PCA biplot of the same 15 NGS predictors in the test set. Arrows indicate the projection directions of the model without NGS (orange), the model with NGS (navy), and the observed HT vector (green), showing alignment with actual transformation events. Gray dots correspond to individual gene loadings on Principal Components 1 (x-axis, 55% variance) and 2 (y-axis, 38.4% variance).
