Sci Rep. 2023 Sep 22;13(1):15772.
doi: 10.1038/s41598-023-41353-5.

Texture feature analysis of MRI-ADC images to differentiate glioma grades using machine learning techniques

Sahan M Vijithananda et al. Sci Rep. 2023.

Abstract

Apparent diffusion coefficient (ADC) mapping in magnetic resonance imaging (MRI) is an indispensable technique in clinical neuroimaging that quantitatively assesses the diffusivity of water molecules within tissues using diffusion-weighted imaging (DWI). This study focuses on developing a robust machine learning (ML) model to predict the aggressiveness of gliomas according to World Health Organization (WHO) grading by analyzing patients' demographics, higher-order moments, and grey level co-occurrence matrix (GLCM) texture features of ADC. A population of 722 labeled MRI-ADC brain image slices from 88 human subjects was selected, with gliomas labeled as glioblastoma multiforme (WHO IV), high-grade glioma (WHO III), and low-grade glioma (WHO I-II). Images were acquired using 3T MR systems, and a region of interest (ROI) was delineated manually over tumor areas. Skewness, kurtosis, and statistical texture features of the GLCM (mean, variance, energy, entropy, contrast, homogeneity, correlation, prominence, and shade) were calculated using ADC values within the ROI. The ANOVA F-test was used to select the best features for training an ML model. The data set was split into training (70%) and testing (30%) sets. The training set was fed into several ML algorithms, and the most promising algorithm was selected using K-fold cross-validation. The hyperparameters of the selected algorithm were optimized using a random grid search. Finally, the performance of the developed model was assessed by calculating accuracy, precision, recall, and F1 values on the test set. According to the ANOVA F-test, the three attributes with the lowest scores, patient gender (1.48), GLCM energy (9.48), and GLCM correlation (13.86), were excluded from the dataset. Among the tested algorithms, the random forest classifier achieved the highest mean cross-validation score (0.8772 ± 0.0237) and was selected to build the ML model, which predicted tumor categories with an accuracy of 88.14% on the test set. The study concludes that the ML model developed using the above features, except for patient gender, GLCM energy, and GLCM correlation, has high prediction accuracy in glioma grading. Therefore, the outcomes of this study enable the development of advanced tumor classification applications that assist decision-making in a real-time clinical environment.
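The feature extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not state the pixel offset, the grey-level quantization, or the exact formulas used, so everything below is an assumption: a one-pixel horizontal offset, a symmetric normalized GLCM, common textbook definitions (energy taken as the sum of squared entries), and hypothetical names such as `glcm`, `glcm_features`, `skewness_kurtosis`, and the toy `patch`.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Texture features of a symmetric GLCM (assumed textbook definitions)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)  # row mean equals column mean here
    variance = sum((i - mean) ** 2 * v for i, _, v in cells)
    covariance = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": variance,
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "correlation": covariance / variance if variance else 0.0,
        "shade": sum((i + j - 2 * mean) ** 3 * v for i, j, v in cells),
        "prominence": sum((i + j - 2 * mean) ** 4 * v for i, j, v in cells),
    }

def skewness_kurtosis(values):
    """Population skewness and (non-excess) kurtosis of the ROI values."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Toy 4x4 patch already quantized to 4 grey levels (hypothetical data).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = glcm_features(p)
skew, kurt = skewness_kurtosis([v for row in patch for v in row])
```

In practice the ADC values inside the ROI would first be quantized to a fixed number of grey levels, and features from several offsets and directions are often averaged; neither choice is specified in the abstract.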

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Application of the supervised learning method to the multiclass tumor classification problem. The flow chart illustrates the steps followed in developing a multiclass classification model. After the nature of the problem is identified, the necessary MRI and histopathology data are collected. In the data pre-processing step, the texture features are extracted from the MRI images, and the extracted data are prepared for training the machine learning model (data labeling, removal of defective data, binarization). The next step splits the dataset into training and test sets. The most promising machine learning algorithm for the dataset is selected and trained on the training set to build the classification model. Finally, the performance of the developed model is assessed. When the performance does not meet the required level, the hyperparameters of the model are tuned to find the most suitable combination. Sometimes it is also necessary to revise the data collection and pre-processing and to repeat the training and testing steps until the model meets the required performance.
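The splitting and cross-validation steps in the flow chart can be sketched with the standard library alone. This is only an illustration of a 70/30 shuffle split followed by K-fold partitioning of the training indices; the sample count, seed, fold count, and the function names `train_test_split_indices` and `k_fold` are assumptions, and the tooling actually used in the study is not stated here.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and return (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold(indices, k):
    """Yield (training, validation) index lists for each of k folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

# Hypothetical dataset of 100 samples, 5-fold cross-validation.
train_idx, test_idx = train_test_split_indices(100, test_fraction=0.3, seed=0)
splits = list(k_fold(train_idx, k=5))
```

Each candidate algorithm would be fitted on every `training` list and scored on the matching `validation` list, and the mean and standard deviation of those scores compared across algorithms; the held-out `test_idx` is touched only for the final assessment.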
Figure 2
ANOVA F-test results shown as a bar chart. The figure illustrates the feature importance score for each input feature. The standardized features (mean ADC, skewness, kurtosis, GLCM mean 1, GLCM mean 2, GLCM variance 1, GLCM variance 2, energy, entropy, contrast, homogeneity, correlation, prominence, shade, patient's age, and gender) are indicated by the numbers 0 to 15 in the bar chart, respectively.
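The score behind each bar is a one-way ANOVA F statistic, the ratio of between-class to within-class variance for a single feature. A minimal sketch follows, using made-up feature values and labels; the function name `anova_f` and the toy data are hypothetical and are not the study's data.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical toy data: three classes, three samples each.
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
separating = [1, 2, 3, 4, 5, 6, 7, 8, 9]      # class means differ
uninformative = [1, 5, 9, 2, 5, 8, 3, 5, 7]   # class means are identical

scores = {
    "separating": anova_f(separating, labels),
    "uninformative": anova_f(uninformative, labels),
}
```

Features with the lowest F scores separate the classes least and are the natural candidates for exclusion.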
Figure 3
Multiclass receiver operating characteristic (ROC) curve for the base model. The ROC curve illustrates the trade-off between the true positive rate and the false positive rate, reflecting the performance of the classification model at various threshold settings. The performance of the multiclass classification model is displayed as ROC curves using the one-vs-rest technique. Classes 0, 1, and 2 represent glioblastoma, high-grade glioma, and low-grade glioma, respectively. The area under the curve (AUC) for each curve is: yellow, 0.9434; green, 0.9521; blue, 0.9885.
Figure 4
Multiclass receiver operating characteristic (ROC) curve for the base model after hyperparameter tuning. The performance of the tuned multiclass classification model is displayed as ROC curves using the one-vs-rest technique. Classes 0, 1, and 2 represent glioblastoma, high-grade glioma, and low-grade glioma, respectively. The area under the curve (AUC) for each curve is: yellow, 0.9525; green, 0.9545; blue, 0.9901.
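The one-vs-rest AUC values in these captions can be illustrated with a small sketch. It treats each class in turn as the positive class and computes the AUC as the fraction of (positive, negative) pairs that the class probability ranks correctly, counting ties as half. The probabilities and labels below are invented for illustration, and `binary_auc` and `one_vs_rest_auc` are hypothetical helper names.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0
        for sp in pos
        for sn in neg
    )
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(probabilities, labels, n_classes):
    """One AUC per class, with that class against all the others."""
    return [
        binary_auc([row[c] for row in probabilities], [y == c for y in labels])
        for c in range(n_classes)
    ]

# Hypothetical predicted class probabilities for six samples.
labels = [0, 0, 1, 1, 2, 2]
probabilities = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]
aucs = one_vs_rest_auc(probabilities, labels, n_classes=3)
```

Here class 1 scores below 1.0 because one class 0 sample receives a higher class 1 probability than one of the true class 1 samples.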
Figure 5
Confusion matrix illustrating the performance of the tuned classification model. According to the confusion matrix, the tuned model correctly predicted 112 out of 130 cases of glioblastoma multiforme (GBM), 109 out of 129 cases of high-grade glioma (HGG), and 121 out of 129 cases of low-grade glioma (LGG).
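The counts in this caption are enough to work out per-class recall and overall accuracy. The sketch below uses only those counts; precision and F1 are left out because they need the off-diagonal entries of the confusion matrix, which the caption does not give. The variable names are illustrative.

```python
# Correct predictions and class totals as stated in the caption.
correct = {"GBM": 112, "HGG": 109, "LGG": 121}
total = {"GBM": 130, "HGG": 129, "LGG": 129}

# Recall per class: correctly predicted cases over all cases of that class.
recall = {name: correct[name] / total[name] for name in correct}

# Overall accuracy: all correct predictions over all cases.
accuracy = sum(correct.values()) / sum(total.values())

for name, value in recall.items():
    print(f"{name} recall: {value:.4f}")
print(f"accuracy: {accuracy:.2%}")
```

The accuracy works out to 342/388, which agrees with the 88.14% test accuracy reported in the abstract.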
Figure 6
Apparent diffusion coefficient (ADC) images of gliomas. (A) ADC brain image of a 62-year-old male patient presenting with glioblastoma multiforme (GBM) (WHO grade IV). (B) ADC brain image of a 16-year-old male patient with anaplastic oligodendroglioma (WHO III). (C) ADC brain image of a 39-year-old female patient presenting with low-grade glioma (WHO II). (D) ADC brain image of a 49-year-old male patient presenting with a schwannoma (WHO I). (E–H) illustrate the regions of interest drawn over the tumor areas of images (A–D), respectively.
