Front Genet. 2024 Apr 3:15:1242636.
doi: 10.3389/fgene.2024.1242636. eCollection 2024.

Donor whole blood DNA methylation is not a strong predictor of acute graft versus host disease in unrelated donor allogeneic haematopoietic cell transplantation

Amy P Webster et al. Front Genet. 2024.

Abstract

Allogeneic hematopoietic cell transplantation (HCT) is used to treat many blood-based disorders and malignancies; however, it can also result in serious adverse events, such as the development of acute graft-versus-host disease (aGVHD). This study aimed to develop a donor-specific epigenetic classifier to reduce the incidence of aGVHD by improving donor selection. Genome-wide DNA methylation was assessed in a discovery cohort of 288 HCT donors selected based on recipient aGVHD outcome; this cohort consisted of 144 cases with aGVHD grades III-IV and 144 controls with no aGVHD. We applied a machine learning algorithm to identify CpG sites predictive of aGVHD. Receiver operating characteristic (ROC) curve analysis of these sites resulted in a classifier with an encouraging area under the ROC curve (AUC) of 0.91. To test this classifier, we used an independent validation cohort (n = 288) selected using the same criteria as the discovery cohort. The classifier failed to validate, with the AUC falling to 0.51. These results indicate that donor DNA methylation may not be a suitable predictor of aGVHD in an HCT setting involving unrelated donors, despite the initial promising results in the discovery cohort. Our work highlights the importance of independent validation of machine learning classifiers, particularly when developing classifiers intended for clinical use.
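To make the AUC values quoted above easier to interpret, here is a minimal sketch of how an area under the ROC curve can be computed from classifier scores using the rank-based (Mann-Whitney) identity. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data; the point is only that an AUC near 1.0 means cases are consistently scored above controls, while an AUC near 0.5 (as in the validation cohort) is what an uninformative classifier produces.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels: iterable of 1 (case) or 0 (control).
    scores: classifier outputs, where a higher score means "more case-like".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs in which the case outranks the control;
    # tied pairs contribute half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three cases followed by three controls.
labels = [1, 1, 1, 0, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])  # cases always rank higher
uninformative = roc_auc(labels, [0.5] * 6)                 # every pair is a tie
print(perfect, uninformative)
```

The pairwise form is quadratic in cohort size, which is acceptable for cohorts of a few hundred donors; a sort-based version would be preferable for much larger datasets.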

Keywords: DNA methylation; HCT (hematopoietic cell transplant); biomarker identification and validation; epigenetics; haematopoietic stem cell transplant; machine learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Study Design. Unrelated donor-recipient pairs were selected based on the outcome of recipients following HCT. DNA methylation levels were assessed in donors associated with no (Grade 0) or severe (Grades 3–4) aGVHD in recipients. Donor-recipient pairs were HLA matched, and comparison groups were matched for sex, age, disease and GVHD prophylaxis. Feature selection reduced the number of probes in the discovery dataset to 10,000 for input to random forest analyses, and this classifier was subsequently tested in the validation cohort following pre-processing of data and refinement to the same set of probes.
FIGURE 2
ROC curves showing the performance of the unsupervised Random Forest Classifier. Plot (A) shows the performance of the variability-based (unsupervised approach) classifier, which used the top 10,000 most variable CpG sites as input, during internal cross-validation on the training dataset. Plot (B) shows the performance of the same classifier on the independent validation cohort, with an AUC of 0.523, a sensitivity of 50.0% and a specificity of 51.4%, i.e., close to chance.
FIGURE 3
ROC curves showing the performance of the supervised Random Forest Classifier. The blue line shows the performance of the differential methylation (supervised approach) classifier, which used the top 10,000 most differentially methylated CpG sites as input, during internal cross-validation on the training dataset. The orange line shows the performance of the same classifier on the independent validation cohort, where it had an AUC of 0.508, a sensitivity of 90.97% and a very poor specificity of 6.25%. Although this differential methylation-based classifier initially appeared encouraging in the discovery cohort, it did not perform well during validation analyses.
FIGURE 4
ROC curves of classifiers developed using additional machine learning methods. The additional machine learning methods applied to the supervised dataset were Support Vector Machines (SVM), Gradient Boosting Machines (GBM), k-Nearest Neighbours (KNN), Multi-Layer Perceptrons (MLP) and Logistic Regression (LR). For each model, we explored a range of hyperparameters through a grid search approach. Each experiment was executed 40 times with different random seeds, resulting in more than 2,300 trained models in total. The ROC curves illustrate the best performing models, which reached a maximum validation AUC of 0.6 for the LR method. Plot (A) shows the performance of these models in the discovery cohort, while plot (B) shows their performance in the validation cohort.
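
The sensitivity and specificity values quoted in the Figure 3 caption can be read as simple ratios from a confusion matrix. The sketch below shows the arithmetic. The counts are hypothetical: they are back-calculated on the assumption that the validation cohort, like the discovery cohort, contained 144 cases and 144 controls, and they are not reported in the source.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cases that were flagged
    specificity = tn / (tn + fp)  # fraction of true controls that were cleared
    return sensitivity, specificity


# ASSUMPTION: 144 cases and 144 controls; these counts are illustrative only
# and were chosen because they reproduce the percentages in the caption.
sens, spec = sensitivity_specificity(tp=131, fn=13, tn=9, fp=135)
balanced_accuracy = (sens + spec) / 2

print(f"sensitivity={sens:.2%} specificity={spec:.2%} "
      f"balanced accuracy={balanced_accuracy:.2%}")
```

Under these assumed counts the classifier labels almost every donor as a case, which is why high sensitivity coexists with very low specificity and a balanced accuracy close to 50%.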
