Front Public Health. 2026 Feb 17:14:1682879.

doi: 10.3389/fpubh.2026.1682879. eCollection 2026.

Value of an automated machine learning model with post-hoc explanation for predicting healthcare-seeking delays among residents in Tibetan regions

Zhenzhong Xi¹, Chenxing Meng¹, Qian Li², Yisha Xu³, Peng Wu⁴, Zhigang Zhang⁵, Tingyong Han⁶, Liangjie Zhang⁷, Xinxuan Han¹

Affiliations

¹ The 945th Hospital of the Joint Logistics Support Force, PLA, Ya'an, Sichuan, China.
² General Hospital of Western Theater Command, PLA, Chengdu, Sichuan, China.
³ Ya'an People's Hospital, Ya'an, Sichuan, China.
⁴ Yucheng District People's Hospital of Ya'an, Ya'an, Sichuan, China.
⁵ Mingshan District People's Hospital of Ya'an, Ya'an, Sichuan, China.
⁶ Affiliated Hospital of Ya'an Polytechnic College, Ya'an, Sichuan, China.
⁷ Ya'an Hospital of Traditional Chinese Medicine, Ya'an, Sichuan, China.

PMID: 41783714
PMCID: PMC12953569
DOI: 10.3389/fpubh.2026.1682879

Zhenzhong Xi et al. Front Public Health. 2026.

Abstract

Objective: This study aimed to investigate key determinants of healthcare-seeking delays among Tibetan residents and develop predictive models using automated machine learning (AutoML) with post-hoc SHAP interpretation alongside a clinical decision support system.

Methods: Face-to-face surveys using structured questionnaires were administered to 1,879 Tibetan residents. Data were processed within an AutoML framework: the dataset was partitioned into training (n = 1,503) and testing (n = 376) subsets at an 8:2 ratio. Standardized preprocessing, including outlier rectification, one-hot encoding (OHE), and random forest-based multiple imputation (MI), was applied. Model validation combined 5-fold cross-validation with SHapley Additive exPlanations (SHAP) analysis.
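The partitioning described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a plain random shuffle (the abstract does not say whether the split was stratified), and the function names and the seed are hypothetical.

```python
import random

def train_test_split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # hypothetical seed, for repeatability
    cut = int(n * train_fraction)         # floor: 1,879 * 0.8 gives 1,503
    return indices[:cut], indices[cut:]

def k_fold_indices(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # disjoint, near-equal folds
    for i in range(k):
        validate = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate

train_idx, test_idx = train_test_split_indices(1879)
fold_sizes = [len(validate) for _, validate in k_fold_indices(train_idx)]
print(len(train_idx), len(test_idx), fold_sizes)
```

With 1,879 records, an 8:2 split reproduces the reported subset sizes, and each cross-validation round holds out roughly one fifth of the training subset.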

Results: Among the 1,879 participants, the incidence of healthcare-seeking delay was 41.99%. The LightGBM model significantly outperformed conventional approaches (AUC > 0.86). SHAP feature importance analysis ranked the leading predictors as follows: Age > County hospital quality score > Distance to county hospital > Township health center quality score > Able to communicate in Chinese.
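For readers less familiar with the AUC metric reported above, the sketch below computes it from its rank-based definition: the probability that a randomly chosen delayed case receives a higher predicted risk than a randomly chosen non-delayed case. The labels and scores are hypothetical toy values, not data from the study, and `roc_auc` is an illustrative helper rather than the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical toy data: 1 = delayed care seeking, 0 = no delay.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.86 indicates that the model ranks most delayed cases above most non-delayed ones.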

Conclusion: A high-performance model with post-hoc SHAP interpretation captures geographical, cultural, and healthcare resource variables and accurately identifies high-risk populations. The developed clinical decision support system enables risk computation through modular interfaces, providing an evidence-based tool for optimizing hierarchical diagnosis and resource allocation in Tibetan healthcare.

Keywords: Tibetan healthcare; automated machine learning; clinical decision support system; healthcare-seeking delay; interpretability analysis.

Conflict of interest statement

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 2**
Comparative evaluation of optimization efficacy in swarm intelligence algorithms. The horizontal axis denotes different CEC2022 benchmark functions (or function indices), while the vertical axis represents the best fitness values (objective function values) obtained over 30 independent algorithm executions. Lower values indicate superior optimization efficacy. Each box plot illustrates the statistical distribution of 30 runs for a given algorithm on a specific test function. Narrower interquartile ranges (IQR) and whisker spans (extending to ±1.5 × IQR) reflect enhanced algorithmic stability. As visually evident, IETO exhibits markedly more compact distributions (lower median positions and reduced IQRs) than ETO, WOA, and PSO across most functions, validating its superior solution quality and robust convergence behavior.
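The box-plot quantities named in the caption above (median, IQR, and whiskers limited to 1.5 × IQR beyond the quartiles) can be computed as in this minimal sketch. The run values are hypothetical and are not results from the study; `box_plot_summary` is an illustrative helper, and the "inclusive" quartile method is an assumption, since plotting tools differ in how they interpolate quartiles.

```python
import statistics

def box_plot_summary(values):
    """Return the median, IQR, and whisker ends for one box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }

# Hypothetical best-fitness values (lower is better), not data from the study.
stable_runs = [1.0, 1.1, 1.2, 1.1, 1.0, 1.3, 1.2, 1.1]
erratic_runs = [1.0, 2.5, 4.0, 1.5, 6.0, 3.0, 2.0, 5.0]
stable = box_plot_summary(stable_runs)
erratic = box_plot_summary(erratic_runs)
print(stable["iqr"] < erratic["iqr"])
```

A smaller IQR and a lower median over repeated runs are what the caption reads as greater stability and better solution quality.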

**Figure 3**
Comparative analysis of convergence dynamics in swarm intelligence algorithms. The horizontal axis indicates iteration counts (1–500), and the vertical axis quantifies either the contemporary population mean fitness or best fitness values. Lower values denote higher solution quality. Each trajectory depicts evolutionary fitness progression for one algorithm. Accelerated initial descent rates characterize rapid convergence, while sustained declines toward lower asymptotic plateaus signify enhanced capability for escaping local optima. The IETO trajectory (distinguished by [specified line style/color]) demonstrates the steepest initial convergence gradient and achieves the lowest terminal values across functions, confirming its dual proficiency in swift convergence and global exploration efficacy.

**Figure 4**
Cross-validation performance of the training set. **(A)** ROC curve of training set; **(B)** PR curve of training set.

**Figure 5**
Cross-validation performance of the testing set. **(A)** ROC curve of testing set; **(B)** PR curve of testing set.

**Figure 6**
Decision curve analysis of the prediction model in the **(A)** training set and **(B)** testing set. Note: The Y-axis shows the net benefit; the solid line represents the prediction model, the red dashed line represents the assumption that all patients develop delays, and the black dashed line represents the assumption that no patients develop delays.
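The net benefit plotted in a decision curve follows the standard definition: true positives per person minus false positives per person, with false positives weighted by the odds of the threshold probability. The sketch below applies that definition to hypothetical toy data; it is not the authors' code, and the helper names are illustrative.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of flagging everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    flagged = [(y, p) for y, p in zip(labels, probabilities) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return true_pos / n - false_pos / n * odds

def net_benefit_treat_all(labels, threshold):
    """Reference curve: assume every person develops a delay."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: 1 = delayed care seeking, 0 = no delay.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]
model_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
all_nb = net_benefit_treat_all(toy_labels, threshold=0.5)
none_nb = 0.0  # the "no patients develop delays" reference is always zero
print(model_nb, all_nb, none_nb)
```

A model is useful at a given threshold when its net benefit lies above both reference lines.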

**Figure 7**
Machine learning interpretability analysis.
**(A)** The Shapley summary plot comprehensively presents the overall impact patterns of various features on model predictions across all samples. Each point in the plot represents a feature and its SHAP value (i.e., the feature’s contribution to prediction) for a specific sample. The color of the points indicates the actual value magnitude of the feature (yellow for high values, blue for low values), while the distribution along the horizontal axis (SHAP values) reflects how feature values influence predictions (positive values increase predictions, negative values decrease them). This visualization allows intuitive identification of which features generally correlate with increases or decreases in predicted values, as well as trend relationships between feature influence and feature magnitudes.
**(B)** The Shapley feature importance plot displays the overall importance ranking of each feature’s impact on model predictions in bar chart form. Feature importance is determined by calculating the mean absolute SHAP value for each feature across all samples, thereby measuring its average contribution to model output variations. Longer bars indicate greater influence of the feature in the model’s overall decision-making process, providing researchers with clear insight into the most critical factors driving predictions.
**(C–E)** Waterfall plots illustrate the cumulative contribution process of each feature to individual patient predictions. The baseline value represents the model’s average prediction for all patients, while feature contributions show how each feature affects the final prediction (red indicating increased risk, blue indicating decreased risk). The sum of all feature contributions yields the final predicted value.
**(F)** The decision path plot compares decision pathways across multiple patients, demonstrating how different feature combinations lead to varying prediction outcomes. The horizontal axis shows predicted probabilities, the vertical axis lists features, and the curved pathways trace decision routes from baseline values to final predictions.
**(G–I)** Force plots visually demonstrate how each feature “pushes” predictions toward higher or lower risk directions. Red arrows indicate features pushing predictions toward higher risk, blue arrows indicate features pushing toward lower risk, with arrow length representing the magnitude of influence.
HosQuality: County hospital quality score; Distance: Distance to county hospital; TownQuality: Township health center quality score; Chinese: Able to communicate in Chinese.
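Two properties described in this caption, the additivity shown in the waterfall plots (baseline plus feature contributions equals the prediction) and the mean absolute SHAP value used for the importance ranking, can be illustrated with exact Shapley values on a tiny example. This is a sketch under stated assumptions: the linear risk function `toy_risk`, its coefficients, and the three background records are hypothetical and are not the study's LightGBM model or data. Brute-force enumeration is only feasible for a handful of features; SHAP libraries use faster algorithms.

```python
from itertools import permutations

def toy_risk(row):
    """Hypothetical risk score; the coefficients are illustrative only."""
    return 0.1 + 0.006 * row["Age"] + 0.003 * row["Distance"] - 0.1 * row["Chinese"]

def shapley_values(predict, instance, background):
    """Exact Shapley values for one instance by enumerating feature orderings.

    Features not yet revealed are taken from each background row and the
    predictions averaged, so the baseline is the mean background prediction.
    """
    features = list(instance)

    def value(known):
        total = 0.0
        for row in background:
            mixed = {f: (instance[f] if f in known else row[f]) for f in features}
            total += predict(mixed)
        return total / len(background)

    contributions = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        known = set()
        before = value(known)
        for f in ordering:
            known.add(f)
            after = value(known)
            contributions[f] += after - before  # marginal contribution of f
            before = after
    baseline = value(set())
    return {f: c / len(orderings) for f, c in contributions.items()}, baseline

# Hypothetical records (Age in years, Distance in km, Chinese as 0/1).
background = [
    {"Age": 30, "Distance": 10, "Chinese": 1},
    {"Age": 50, "Distance": 40, "Chinese": 0},
    {"Age": 70, "Distance": 70, "Chinese": 1},
]

# Global importance: mean absolute Shapley value per feature across records.
importance = {f: 0.0 for f in background[0]}
for row in background:
    contributions, baseline = shapley_values(toy_risk, row, background)
    for f, c in contributions.items():
        importance[f] += abs(c) / len(background)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

For each record, the baseline plus the sum of the contributions reproduces `toy_risk(row)`, which is the relationship a waterfall plot visualizes.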

**Figure 8**
SHAP interaction analysis between key indicators.

**Figure 9**
Demonstration of clinical decision support system.
