Heliyon. 2022 Sep 29;8(10):e10772.
doi: 10.1016/j.heliyon.2022.e10772. eCollection 2022 Oct.

Prediction of hepatocellular carcinoma risk in patients with type-2 diabetes using supervised machine learning classification model

Noor Atika Azit et al. Heliyon.

Abstract

Background: Hepatocellular carcinoma (HCC) among type-2 diabetes (T2D) patients is an increasing burden to diabetes management. This study aims to develop and select the best machine learning (ML) classification model for predicting HCC in T2D for HCC early detection.

Methods: A case-control study was conducted utilising computerised medical records in two hepatobiliary centres. The predictors were chosen using multiple logistic regression. IBM SPSS Modeler® was used to assess the discriminative performance of support vector machine (SVM), logistic regression (LR), artificial neural network (ANN), chi-square automatic interaction detection (CHAID), and their ensembles.
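Discriminative performance of a binary classifier is commonly summarised by accuracy and by the area under the ROC curve (AUC). The sketch below is a minimal illustration of those two metrics in plain Python, not the SPSS Modeler procedure used in the study. The function names, the 0.5 decision threshold, and the labels and scores are all hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case (label 1) receives a
    higher score than a random control (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of subjects classified correctly at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = HCC, 0 = no HCC (not the study's data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

The pairwise comparison is quadratic in the number of subjects, which is acceptable for a data set of a few hundred records; a rank-based formula would be preferable for larger ones.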

Results: Subjects (N = 424) were split into 60% training (n = 248) and 40% testing (n = 176) groups. The independent predictors identified were race, viral hepatitis, abdominal pain/discomfort, unintentional weight loss, statins, alcohol consumption, non-alcoholic fatty liver, platelet <150 × 10³/μL, alkaline phosphatase >129 IU/L, and alanine transaminase ≥25 IU/L. The performances of all models differed significantly (Cochran's Q-test, p = 0.001), but the ensemble and SVM models did not differ significantly (McNemar test, p = 0.687). The SVM model was selected as the best model due to its simplicity, high accuracy (85.28%), and high AUC (0.914). A web-based application for HCC prediction was developed using the best model's algorithm.
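The McNemar test compares two classifiers evaluated on the same subjects by looking only at the discordant pairs, that is, subjects that one model classifies correctly and the other does not. The following is a minimal sketch of the exact (binomial) form of the test; the abstract does not say which variant was used, and the function name and counts here are hypothetical, not taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: subjects model A classified correctly and model B incorrectly.
    c: subjects model B classified correctly and model A incorrectly.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis the discordant pairs split 50/50,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for two models on one test set.
p_similar = mcnemar_exact(5, 5)     # evenly split disagreements
p_different = mcnemar_exact(10, 0)  # one model wins every disagreement
```

A large p-value, as in the evenly split case, means the data give no evidence that the two models' error rates differ.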

Conclusions: If further validation studies confirm these results, applying the SVM model could augment early HCC detection in T2D patients.

Keywords: Diabetes; Hepatocellular carcinoma; Machine learning; Risk prediction; Support vector machine.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
SPSS Modeler stream. The dataset file contained all the variables. During data processing, a Type node was used to select the variables and assign the appropriate categories. A Data Audit node was used to visualise the distribution and validity of each selected variable. A SetToFlag node was used for feature engineering, converting nominal variables into flag (yes/no) variables. The transformed data were re-analysed using the Data Audit node.
Figure 2
The characteristics of the target and input variables included in the models. HCC status is the target variable; the other 12 are input variables. All used the flag (yes/no) measurement level. Red in the graphs indicates the proportion of each variable with HCC = yes (1). No variable had missing values. There was no significant difference between the training and testing sets at the 0.05 significance level.
Figure 3
Predictor importance, showing the relative contribution of each variable to each model: a) LR: all input variables were included in the model, with viral hepatitis contributing the most; b) ANN: viral hepatitis contributed the most and ALP the least; c) SVM: all variables were included, with viral hepatitis contributing the most; and d) CHAID: only six of the 12 input variables were selected in the final model, with viral hepatitis contributing the most.
Figure 4
a) The web-based application, with an example of an Indian patient with no risk factors. b) The HCC risk estimate for an Indian patient with all the risk factors present.
Figure 5
Patients in the T2D clinic who undergo routine check-ups and blood investigations will be assessed for HCC risk using the web-based HCC risk predictor. Patients predicted to have HCC need to be referred for further assessment, including hepatobiliary imaging such as ultrasound. Those not predicted to have HCC will be assessed again at the next routine blood investigation.

