Ann Med. 2025 Dec;57(1):2477294.
doi: 10.1080/07853890.2025.2477294. Epub 2025 Mar 19.

Identifying liver cirrhosis in patients with chronic hepatitis B: an interpretable machine learning algorithm based on LSM

Xueting Bai et al. Ann Med. 2025 Dec.

Abstract

Background: Chronic hepatitis B (CHB) is a common cause of liver cirrhosis (LC), a condition associated with an unfavourable prognosis. Therefore, timely diagnosis of LC in CHB patients is crucial.

Objective: This study aimed to enhance the diagnostic accuracy of LC in CHB patients by integrating liver stiffness measurement (LSM) with traditional indicators.

Methods: The study participants were randomly divided into training and internal validation sets. Using the least absolute shrinkage and selection operator (LASSO) and random forest-recursive feature elimination (RF-RFE) for feature selection, we developed a traditional logistic regression model and five machine learning models (k-nearest neighbors, random forest (RF), artificial neural network, support vector machine and eXtreme Gradient Boosting). Performance was evaluated with receiver operating characteristic curves, calibration curves and decision curve analysis. Shapley additive explanations (SHAP) were used to improve the interpretability of the optimal model.

Results: We retrospectively included 1609 patients with CHB, among whom 470 were diagnosed with cirrhosis. Cirrhosis was diagnosed based on histological confirmation or clinical assessment, supported by characteristic findings on abdominal ultrasound and corroborative evidence such as thrombocytopenia, varices or imaging from CT/MRI. In the internal validation, the RF model achieved an accuracy above 0.80 and an AUC above 0.80, with outstanding calibration ability and clinical net benefit. Additionally, the model exhibited excellent predictive performance in an independent external validation set. The SHAP analysis indicated that LSM contributed the most to the model. The model retained strong discriminative power when using LSM alone or traditional indicators alone.

Conclusions: Machine learning models, especially the RF model, can effectively identify LC in CHB patients. Integrating LSM with traditional indicators can enhance diagnostic performance.

Keywords: Chronic hepatitis B; diagnostic model; liver cirrhosis; liver stiffness measurement; machine learning.
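The abstract reports accuracy and AUC for each model. As a minimal sketch of how these two metrics can be computed from predicted probabilities, the following pure-Python example may help; it is not the authors' code, and the function names, the 0.5 threshold and the toy data are all hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a given probability cut-off.
    The 0.5 default is an assumption, not a value from the study."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = cirrhosis, 0 = no cirrhosis.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

print(auc_from_scores(labels, scores))
print(accuracy(labels, scores))
```

The pairwise definition used here is quadratic in the number of cases; it is chosen for readability rather than speed.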

Plain language summary

Liver cirrhosis (LC) is a common complication of chronic hepatitis B (CHB). The random forest (RF) model showed the best overall performance in identifying LC in CHB patients in our study, which could assist in the clinical decision-making procedure. Integrating LSM with traditional indicators can enhance the diagnostic performance of LC in CHB patients. In the absence of LSM, other traditional indicators can also diagnose LC effectively.

Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures

Figure 1.
Flowchart. LR: logistic regression; XGBoost: eXtreme Gradient Boosting; ANN: artificial neural network; RF: random forest; KNN: k-nearest neighbors; SVM: support vector machine; ROC: receiver operating characteristic; DCA: decision curve analysis; SHAP: Shapley additive explanations.
Figure 2.
Screening of characteristic factors. (A) Feature variables screening based on RF-RFE. (B) Feature variables screening based on LASSO. (C) LASSO combined with RF-RFE. LASSO: least absolute shrinkage and selection operator; RF: random forest; RFE: recursive feature elimination.
Figure 3.
ROC curves for the prediction models. (A) ROC curve in the training set. (B) ROC curve in the internal validation set. (C) ROC curve in the external validation set. ROC: receiver operating characteristic; AUC: area under the curve; ANN: artificial neural network; KNN: k-nearest neighbors; LR: logistic regression; RF: random forest; SVM: support vector machine; XGBoost: eXtreme Gradient Boosting.
Figure 4.
Confusion matrices of six models in the training and validation sets. (A) Confusion matrices in the training set. (B) Confusion matrices in the internal validation set. (C) Confusion matrices in the external validation set. LR: logistic regression; ANN: artificial neural network; SVM: support vector machine; RF: random forest; KNN: k-nearest neighbors; XGBoost: eXtreme Gradient Boosting; LC: liver cirrhosis.
Figure 5.
Calibration curves for the prediction models. (A) Comprehensive summary figure of six models in the training set. (B) Comprehensive summary figure of six models in the internal validation set. (C) Comprehensive summary figure of six models in the external validation set. (D) Stratified plots of six models in the training set. (E) Stratified plots of six models in the internal validation set. (F) Stratified plots of six models in the external validation set. ANN: artificial neural network; KNN: k-nearest neighbors; LR: logistic regression; RF: random forest; SVM: support vector machine; XGBoost: eXtreme Gradient Boosting.
Figure 6.
DCA curves for the prediction models. (A) DCA curve in the training set. (B) DCA curve in the internal validation set. (C) DCA curve in the external validation set. The treat all curve represents the benefit rates for all cases with intervention, while the treat none curve represents the benefit rates for all cases without intervention. The remaining curves denote various models. The threshold probability represents the probability cut-off used to make a decision, while the net benefit indicates the clinical utility gained from using the model compared to alternative strategies. ANN: artificial neural network; KNN: k-nearest neighbors; LR: logistic regression; RF: random forest; SVM: support vector machine; XGBoost: eXtreme Gradient Boosting.
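The net benefit plotted in a decision curve can be computed with the standard formula, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The sketch below is a hedged illustration of that formula, not the authors' implementation; the function names, the toy labels and probabilities, and the threshold of 0.2 are hypothetical. The prevalence of 470/1609 is the pooled figure from the abstract and is used only as an example input.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every case whose predicted probability
    is at or above `threshold` (standard decision curve formula)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # Odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the 'treat all' strategy: every case is a TP or FP."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# 'Treat none' has a net benefit of 0 at every threshold.
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical toy predictions: 1 = cirrhosis, 0 = no cirrhosis.
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.1]
print(net_benefit(toy_labels, toy_probs, threshold=0.5))

# Pooled prevalence from the abstract; the threshold is an assumption.
print(net_benefit_treat_all(470 / 1609, threshold=0.2))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all and treat-none values.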
Figure 7.
ROC curves for the RF model using LSM with traditional indicators, FIB-4, APRI and GPR. (A) ROC curve in the training set. (B) ROC curve in the internal validation set. (C) ROC curve in the external validation set. FIB-4: fibrosis-4 index; APRI: aspartate aminotransferase to platelet ratio index; GPR: the γ-glutamyl transferase-to-platelet ratio; LSM: liver stiffness measurement; Trad: 17 traditional indicators; ROC: receiver operating characteristic; AUC: area under the curve; RF: random forest.
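For readers unfamiliar with the comparator scores named in this caption, the sketch below computes FIB-4, APRI and GPR from their commonly published definitions. These formulas are not given in the abstract, and the upper limits of normal (ULN) used by the authors are not stated here, so the ULN arguments and the example values are assumptions.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).
    AST and ALT in U/L, platelets in 10^9/L."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, ast_uln_u_l, platelets_1e9_l):
    """APRI = (AST / ULN of AST) / platelets x 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_1e9_l * 100.0


def gpr(ggt_u_l, ggt_uln_u_l, platelets_1e9_l):
    """GPR = (GGT / ULN of GGT) / platelets x 100."""
    return (ggt_u_l / ggt_uln_u_l) / platelets_1e9_l * 100.0


# Hypothetical patient values, chosen only to make the arithmetic easy.
print(fib4(age_years=50, ast_u_l=40, alt_u_l=25, platelets_1e9_l=200))
print(apri(ast_u_l=80, ast_uln_u_l=40, platelets_1e9_l=100))
print(gpr(ggt_u_l=120, ggt_uln_u_l=60, platelets_1e9_l=200))
```

Each score is a single number per patient, so it can be passed directly to an ROC analysis as the ranking score.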
Figure 8.
ROC curves for the RF model based on the traditional indicators, LSM and traditional indicators, and LSM-only. (A) ROC curve in the training set. (B) ROC curve in the internal validation set. (C) ROC curve in the external validation set. LSM: liver stiffness measurement; Trad: 17 traditional indicators; ROC: receiver operating characteristic; AUC: area under the curve; RF: random forest.
Figure 9.
ROC curves for the RF model based on FIB-4, LSM-only and 17 traditional indicators. (A) ROC curve in the training set. (B) ROC curve in the internal validation set. (C) ROC curve in the external validation set. FIB-4: fibrosis-4 index; LSM: liver stiffness measurement; Trad: 17 traditional indicators; ROC: receiver operating characteristic; AUC: area under the curve; RF: random forest.
Figure 10.
SHAP analysis based on the RF model. (A) Ranking of variable importance based on the mean SHAP value. (B) In the SHAP bee swarm plot, each row represents a feature, the x-axis represents the SHAP value, and each data point represents a sample. (C) SHAP analysis of liver cirrhosis in patients with chronic hepatitis B. (D) SHAP force plot of a non-cirrhotic chronic hepatitis B patient.
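The SHAP values shown in these panels are Shapley values: each feature's average marginal contribution to a prediction across all orderings of the features. The sketch below computes exact Shapley values for a tiny hypothetical scoring function by enumerating permutations. It illustrates the underlying idea only; it is not the SHAP library's tree algorithm and not the authors' RF model, and the toy model, its coefficients and the feature values are invented for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).
    Features are switched from baseline to x one at a time, in every
    possible order, and the marginal changes are averaged."""
    n = len(x)
    contrib = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            new = model(current)
            contrib[i] += new - prev
            prev = new
    return [c / len(orders) for c in contrib]


def toy_model(features):
    """Hypothetical risk score over (LSM, platelets, age); not from the paper."""
    lsm, plt, age = features
    return 0.05 * lsm - 0.002 * plt + 0.001 * age + 0.0001 * lsm * age


# Hypothetical patient and reference ("average") patient.
patient = [20.0, 100.0, 50.0]
reference = [8.0, 200.0, 40.0]

phi = shapley_values(toy_model, patient, reference)
for name, value in zip(["LSM", "PLT", "age"], phi):
    print(name, round(value, 4))
```

The contributions sum to the difference between the patient's score and the reference score, which is the property a force plot visualizes. Averaging the absolute contributions over many patients gives an importance ranking of the kind described for panel A.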
