Predicting no-shows at outpatient appointments in internal medicine using machine learning models
- PMID: 40567710
- PMCID: PMC12190658
- DOI: 10.7717/peerj-cs.2762
Abstract
The high prevalence of patient absenteeism in medical appointments poses significant challenges for healthcare providers and patients, causing delays in service delivery and increasing operational inefficiencies. Addressing this issue is crucial in the internal medicine department, a fundamental pillar of comprehensive adult healthcare that manages various chronic and complex conditions. To mitigate absenteeism, we present an innovative application of machine learning models specifically designed to predict the risk of patient absenteeism in the internal medicine department of Fundación Valle del Lili, a high-complexity hospital in Colombia. Leveraging an institutional database, we conducted a statistical analysis to identify critical variables influencing absenteeism risk, including clinical and sociodemographic factors and characteristics of previously attended appointments. Our study evaluated seven distinct machine learning models, explored various data processing techniques, and addressed class imbalance through oversampling and undersampling strategies. Hyperparameter optimization was conducted for each model configuration, culminating in the selection of the Bagging RandomForest model, which demonstrated outstanding performance on standardized data balanced with the Synthetic Minority Oversampling Technique (SMOTE). Additionally, Shapley values (SHAP) were applied to enhance the interpretability of the model, enabling the identification of the most influential variables in predicting medical absenteeism, such as the number of previous absences, the day and month of the appointment, and diagnosed diseases. The selected model achieved a predictive accuracy of 84.80 ± 0.81%, an AUC value of 0.89, an F1-score of 84.75%, and a recall of 83.02% in cross-validation experiments.
These results highlight the potential of our experimental approach to identify the most suitable model for proactively predicting patients at high risk of absenteeism, optimizing resource allocation, and improving the quality of medical care in internal medicine in the future. Our methodology provides a foundation for reducing operational inefficiencies and strengthening intervention strategies. This benefits healthcare providers and patients through more timely and effective care. Ultimately, this approach contributes to improving patient outcomes and institutional efficiency.
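The oversampling step named in the abstract can be illustrated with a small sketch. This is not the authors' code: it is a simplified, pure-Python version of the SMOTE idea (new minority points are interpolated between a minority sample and one of its nearest minority neighbours), assuming Euclidean distance and numeric features. The function names `smote` and `balance`, the parameters, and the two-feature toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Simplified sketch: base samples are drawn at random."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Append synthetic minority rows until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (hypothetical): [previous absences, appointment month]; 1 = no-show.
X = [[0, 1], [1, 3], [0, 6], [2, 9], [1, 11], [0, 4], [5, 2], [6, 3]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))
```

In a cross-validation setting, oversampling of this kind is normally applied to the training folds only, so that synthetic points never leak into the data used for evaluation.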
Keywords: Internal medicine; Machine learning; Medical appointments; No-shows; Non-attendance.
© 2025 Ocampo Osorio et al.
Conflict of interest statement
The authors declare that they have no competing interests.