J Med Virol. 2025 Mar;97(3):e70302.
doi: 10.1002/jmv.70302.

Application of Interpretable Machine Learning Models to Predict the Risk Factors of HBV-Related Liver Cirrhosis in CHB Patients Based on Routine Clinical Data: A Retrospective Cohort Study


Wei Xia et al. J Med Virol. 2025 Mar.

Abstract

Chronic hepatitis B (CHB) infection represents a significant global public health issue, often leading to hepatitis B virus (HBV)-related liver cirrhosis (HBV-LC) with a poor prognosis. Early identification of HBV-LC risk is essential for timely intervention. This study develops and compares nine machine learning (ML) models to predict HBV-LC risk in CHB patients using routine clinical and laboratory data. A retrospective analysis was conducted involving 777 CHB patients, with 50.45% (392/777) progressing to HBV-LC. Admission data consisted of 52 clinical and laboratory variables, with missing values addressed using multiple imputation. Feature selection utilized Least Absolute Shrinkage and Selection Operator (LASSO) regression and the Boruta algorithm, identifying 24 key variables. The evaluated ML models included XGBoost, logistic regression (LR), LightGBM, random forest (RF), AdaBoost, Gaussian naive Bayes (GNB), multilayer perceptron (MLP), support vector machine (SVM), and k-nearest neighbors (KNN). The data set was partitioned into an 80% training set (n = 621) and a 20% independent testing set (n = 156). Cross-validation (CV) facilitated hyperparameter tuning and internal validation of the optimal model. Performance metrics included the area under the receiver operating characteristic curve (AUC), Brier score, accuracy, sensitivity, specificity, and F1 score. The RF model demonstrated superior performance, with AUCs of 0.992 (training) and 0.907 (validation), while the reconstructed model achieved AUCs of 0.944 (training) and 0.945 (validation), maintaining an AUC of 0.863 in the testing set. Calibration curves confirmed a strong alignment between observed and predicted probabilities. Decision curve analysis indicated that the RF model provided the highest net benefit across threshold probabilities. SHapley Additive exPlanations (SHAP) analysis identified RPR, PLT, HBV DNA, ALT, and TBA as critical predictors.
This interpretable ML model enhances early HBV-LC prediction and supports clinical decision-making in resource-limited settings.

Keywords: HBV‐LC; SHAP; chronic hepatitis B; machine learning; prediction model.
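
The abstract names its evaluation metrics (AUC, Brier score, accuracy, sensitivity, specificity, F1 score) and the net benefit used in decision curve analysis, but does not give their definitions or the authors' code. The following is a minimal sketch of the standard definitions in plain Python, not the study's implementation. All function names are hypothetical, and the labels and probabilities are invented toy values, not study data.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion(y_true: Sequence[int], y_prob: Sequence[float],
              threshold: float) -> Dict[str, int]:
    """Confusion-matrix counts when predicting positive at prob >= threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1:
            counts["tp" if predicted_positive else "fn"] += 1
        else:
            counts["fp" if predicted_positive else "tn"] += 1
    return counts


def threshold_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and F1 at a fixed threshold."""
    c = confusion(y_true, y_prob, threshold)
    f1_denominator = 2 * c["tp"] + c["fp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / len(y_true),
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),
        "specificity": c["tn"] / (c["tn"] + c["fp"]),
        "f1": 2 * c["tp"] / f1_denominator if f1_denominator else 0.0,
    }


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    c = confusion(y_true, y_prob, threshold)
    n = len(y_true)
    return c["tp"] / n - (c["fp"] / n) * threshold / (1.0 - threshold)


# Toy example (invented values): 1 = cirrhosis, 0 = no cirrhosis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print("AUC:", auc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Metrics at 0.5:", threshold_metrics(y_true, y_prob, 0.5))
print("Net benefit at 0.5:", net_benefit(y_true, y_prob, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities and comparing the result against the treat-all and treat-none strategies.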


