A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population
- PMID: 35444985
- PMCID: PMC9013842
- DOI: 10.3389/fpubh.2022.846118
Abstract
Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks efficient medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who participated in the national physical examination and used their questionnaire and physical measurement parameters as the models' candidate covariates. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates; four ML algorithms were then used to build screening models for NAFLD, and the best-performing classifier was used to output the importance score of each covariate for NAFLD. Among the four ML algorithms, XGBoost performed best (accuracy = 0.880, precision = 0.801, recall = 0.894, F1 = 0.882, and AUC = 0.951), and the importance ranking of the covariates was, in order, BMI, age, waist circumference, gender, type 2 diabetes, gallbladder disease, smoking, hypertension, dietary status, physical activity, preference for oily food, and preference for salty food. ML classifiers could help medical agencies achieve early identification and classification of NAFLD, which is particularly useful in economically underdeveloped areas, and the covariates' importance ranking should help in the prevention and treatment of NAFLD.
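The performance measures reported above have standard definitions. As a minimal sketch, assuming binary labels (1 = NAFLD, 0 = no NAFLD) and a 0.5 decision threshold, neither of which the abstract specifies, the following Python shows how accuracy, precision, recall, F1, and AUC can be computed from a classifier's predicted scores. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted NAFLD probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.1]

# Assumed decision threshold of 0.5; the abstract does not state the one used.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = screening_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is why it is usually reported alongside the thresholded measures.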
Keywords: LASSO; machine learning; non-alcoholic fatty liver disease (NAFLD); predictive models; screening model.
Copyright © 2022 Ji, Xue, Zhang, Yao and Wang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Similar articles
- Identification of Potential Type II Diabetes in a Large-Scale Chinese Population Using a Systematic Machine Learning Framework. J Diabetes Res. 2020 Sep 24;2020:6873891. doi: 10.1155/2020/6873891. eCollection 2020. PMID: 33029536. Free PMC article.
- Establishment of a machine learning predictive model for non-alcoholic fatty liver disease: A longitudinal cohort study. Nutr Metab Cardiovasc Dis. 2024 Jun;34(6):1456-1466. doi: 10.1016/j.numecd.2024.02.004. Epub 2024 Feb 15. PMID: 38508988.
- Comparison and development of advanced machine learning tools to predict nonalcoholic fatty liver disease: An extended study. Hepatobiliary Pancreat Dis Int. 2021 Oct;20(5):409-415. doi: 10.1016/j.hbpd.2021.08.004. Epub 2021 Aug 14. PMID: 34420885.
- Non-Alcoholic Fatty Liver Disease Treatment in Patients with Type 2 Diabetes Mellitus; New Kids on the Block. Curr Vasc Pharmacol. 2020;18(2):172-181. doi: 10.2174/1570161117666190405164313. PMID: 30961499. Review.
- Pediatric non-alcoholic fatty liver disease: Recent solutions, unresolved issues, and future research directions. World J Gastroenterol. 2016 Sep 28;22(36):8078-93. doi: 10.3748/wjg.v22.i36.8078. PMID: 27688650. Free PMC article. Review.
Cited by
- Mitochondrial mt12361A>G increased risk of metabolic dysfunction-associated steatotic liver disease among non-diabetes. World J Gastroenterol. 2025 Mar 14;31(10):103716. doi: 10.3748/wjg.v31.i10.103716. PMID: 40093674. Free PMC article.
- Artificial Intelligence in Identifying Patients With Undiagnosed Nonalcoholic Steatohepatitis. J Health Econ Outcomes Res. 2024 Sep 25;11(2):86-94. doi: 10.36469/001c.123645. eCollection 2024. PMID: 39351190. Free PMC article.
- Application of Machine Learning Models in Predicting Non-Alcoholic Fatty Liver Disease Among Inactive Chronic Hepatitis B Patients: A Cross-Sectional Analysis. J Clin Med. 2025 Jul 16;14(14):5042. doi: 10.3390/jcm14145042. PMID: 40725732. Free PMC article.
- Development of a novel deep learning method that transforms tabular input variables into images for the prediction of SLD. Sci Rep. 2025 Jul 31;15(1):28024. doi: 10.1038/s41598-025-12900-z. PMID: 40745379. Free PMC article.
- Machine-Learning Algorithm for Predicting Fatty Liver Disease in a Taiwanese Population. J Pers Med. 2022 Jun 23;12(7):1026. doi: 10.3390/jpm12071026. PMID: 35887527. Free PMC article.