NPJ Precis Oncol. 2024 Oct 2;8(1):218.
doi: 10.1038/s41698-024-00718-3.

Comparison of DNA methylation based classification models for precision diagnostics of central nervous system tumors

Quynh T Tran et al. NPJ Precis Oncol.

Abstract

As part of the advancement in therapeutic decision-making for brain tumor patients at St. Jude Children's Research Hospital (SJCRH), we developed three robust classifiers: a deep learning neural network (NN), a k-nearest neighbor (kNN) model, and a random forest (RF), each trained on a reference series of DNA-methylation profiles to classify central nervous system (CNS) tumor types. The models' performance was rigorously validated against 2054 samples from two independent cohorts. In addition to classic metrics of model performance, we compared the robustness of the three models to reduced tumor purity, a critical consideration in the clinical utility of such classifiers. Our findings revealed that the NN model exhibited the highest accuracy and maintained a balance between precision and recall. The NN model was also the most resistant to the drop in performance associated with reduced tumor purity, showing good performance until the purity fell below 50%. Through rigorous validation, our study emphasizes the potential of DNA-methylation-based deep learning methods to improve precision medicine for brain tumor classification in the clinical setting.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Leave-out-25% testing results for each methylation class.
(a) Heat maps showing results of methylation class prediction after 1000 stratified random samplings for the (i) RF, (ii) kNN, and (iii) NN classifiers, incorporating information from n = 2801 reference tumor samples allocated to 91 methylation classes (GSE90496). Deviations from the bisecting line represent misclassification errors (using the maximum calibrated score for class prediction). Boxplots showing (b) the accuracy, (c) precision and recall, and (d) F1-score for each classifier, with outliers.
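The leave-out-25% scheme in this caption can be illustrated with a minimal sketch of a single stratified split. This is not the authors' code: the function name `stratified_split`, the toy labels, and the rule of holding out at least one sample per class are assumptions made for illustration. Repeating the call with different seeds would approximate the repeated random samplings the caption describes.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out ~test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for members in by_class.values():
        rng.shuffle(members)
        # Assumption: hold out at least one sample when a class has 2+ members.
        if len(members) > 1:
            n_test = max(1, round(len(members) * test_fraction))
        else:
            n_test = 0
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class labels standing in for methylation classes.
labels = ["A"] * 8 + ["B"] * 4 + ["C"] * 12
train_idx, test_idx = stratified_split(labels, seed=1)
```

Because the split is done per class, each class keeps roughly the same share of samples in the training and held-out sets.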
Fig. 2. Precision and recall above a classification probabilistic threshold for methylation class and family of each classifier.
(a) Precision and recall when predicting samples in GSE109379. (b) Precision and recall when predicting SJCRH samples. Validation results for subclass calls are in red. Validation results for family calls are in blue. Each point shows the precision and proportion of calls at each classification probabilistic threshold, ranging from 0 to 0.9 in 0.1 increments.
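As a rough illustration of how each point in such a plot could be computed, the sketch below filters calls by score and reports the precision among the retained calls together with the proportion of calls retained. The function name, the toy calls, the class labels, and the choice of `>=` for the threshold comparison are assumptions, not details taken from the paper.

```python
def precision_and_call_rate(calls, threshold):
    """calls: list of (predicted_label, true_label, score) tuples.

    Returns (precision among retained calls, proportion of calls retained).
    Precision is None when no call reaches the threshold.
    """
    kept = [(pred, true) for pred, true, score in calls if score >= threshold]
    if not kept:
        return None, 0.0
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), len(kept) / len(calls)


# Hypothetical predictions: (predicted class, true class, calibrated score).
calls = [
    ("MB", "MB", 0.98),
    ("MB", "EPN", 0.55),
    ("EPN", "EPN", 0.91),
    ("GBM", "GBM", 0.42),
    ("GBM", "MB", 0.30),
]

thresholds = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9
curve = [(t, *precision_and_call_rate(calls, t)) for t in thresholds]
```

In this toy data, raising the threshold discards the low-scoring errors, so precision rises while the proportion of samples receiving a call falls.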
Fig. 3. RFmod, kNNmod, and NNmod classification scores when predicting independent testing samples having all the probes versus samples having 10% of probes randomly dropped.
Line plots showing prediction scores for (a) methylation family and (b) methylation class of the GSE109379 data set. Line plots showing (c) methylation family prediction scores and (d) methylation class prediction scores of the SJCRH data set. Linear regression lines and the R-squared goodness-of-fit measures were estimated using the scores produced from kNNmod (red), NNmod (green), and RFmod (blue).
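The two ingredients of this comparison, random probe removal and an R-squared between paired scores, can be sketched as follows. The helper names and the dictionary representation of a profile are assumptions for illustration; how each classifier handles the missing probes is not described here and is not modeled.

```python
import random


def drop_probes(profile, fraction=0.10, seed=0):
    """Return a copy of a {probe_id: beta} profile with a random fraction removed."""
    rng = random.Random(seed)
    n_drop = round(len(profile) * fraction)
    dropped = set(rng.sample(sorted(profile), n_drop))
    return {probe: beta for probe, beta in profile.items() if probe not in dropped}


def r_squared(x, y):
    """R-squared of a simple least-squares line fitted to y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # For a one-predictor linear fit, R-squared equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical profile with 100 placeholder probe IDs.
profile = {f"probe_{i}": i / 100 for i in range(100)}
reduced = drop_probes(profile, fraction=0.10, seed=42)
```

An R-squared close to 1 between scores from full and reduced profiles would indicate that the classifier's output changes little when probes are missing.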
Fig. 4. Average precision and recall of each classifier at different purity fractions at the 0 and 0.9 thresholds.
Tumor samples from GSE109379 were mixed with control samples (as indicated in Supplemental Table 2) to create different fractions of normal vs tumor mixture. The average precision and recall for predicting methylation classes and families by RFmod (green), kNNmod (purple), and NNmod (orange) were computed for different mixed fractions of GSE109379 (0 to 0.95 purity, shown as points on the lines) at the 0 and 0.9 thresholds.
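One simple way to picture such a mixture is a per-probe weighted average of tumor and control beta values. The sketch below assumes this linear model; the caption does not state the exact mixing procedure, and the function name and probe IDs are hypothetical.

```python
def mix_profiles(tumor, control, purity):
    """Linear mixture of beta values: purity * tumor + (1 - purity) * control.

    Only probes present in both {probe_id: beta} profiles are kept.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    shared = tumor.keys() & control.keys()
    return {p: purity * tumor[p] + (1.0 - purity) * control[p] for p in shared}


# Hypothetical beta values for two placeholder probes.
tumor = {"probe_a": 0.8, "probe_b": 0.1}
control = {"probe_a": 0.2, "probe_b": 0.9}

mixed_half = mix_profiles(tumor, control, purity=0.5)
mixed_pure = mix_profiles(tumor, control, purity=1.0)
```

Under this model, a purity of 1.0 returns the tumor profile unchanged and a purity of 0 returns the control profile.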
Fig. 5. Classification results of the RF, kNN, and NN models for high-grade diffused midline glioblastoma with K-27 mutant (DMG, K27) methylation class at different contamination levels.
(a, d, g) Density plots of all calls (blue curve) and calls over the 0.9 clinical threshold (orange curve) at each possible methylation family predicted by RF, kNN, and NN when the ground truth is DMG, K27 at different fractions of control tissue contamination. (b, e, h) Box plots showing the score distribution for each methylation family predicted by the RF, kNN, and NN models. (c, f, i) Prediction accuracy of each classifier at each purity level.
Fig. 6. Classification results of the RF, kNN, and NN models for grade 4 glioblastoma, IDH wildtype, H3.3 G34 mutant (GBM, G34) methylation class at different contamination levels.
(a, d, g) Density plots of all calls (blue curve) and calls over the 0.9 clinical threshold (orange curve) at each possible methylation family predicted by RF, kNN, and NN when the ground truth is GBM, G34 at different fractions of control tissue contamination. (b, e, h) Box plots showing the score distribution for each methylation family predicted by the RF, kNN, and NN models. (c, f, i) Prediction accuracy of each classifier at each purity level.
