Front Public Health. 2022 Apr 4:10:846118.
doi: 10.3389/fpubh.2022.846118. eCollection 2022.

A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population

Weidong Ji et al. Front Public Health. 2022.

Abstract

Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks efficient medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who took part in the national physical examination and used their questionnaire responses and physical measurement parameters as candidate covariates for the models. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates. Four ML algorithms were then used to build screening models for NAFLD, and the best-performing classifier was used to output the importance score of each covariate. Among the four ML algorithms, XGBoost performed best (accuracy = 0.880, precision = 0.801, recall = 0.894, F1 = 0.882, and AUC = 0.951). The covariates, ranked by importance, were BMI, age, waist circumference, gender, type 2 diabetes, gallbladder disease, smoking, hypertension, dietary status, physical activity, oil-loving, and salt-loving. ML classifiers could help medical agencies achieve early identification and classification of NAFLD, which is particularly useful in economically disadvantaged areas, and the importance ranking of the covariates may inform the prevention and treatment of NAFLD.
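The accuracy, precision, recall, and F1 values reported in the abstract are all derived from the counts in a binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code. The `ConfusionCounts` class, the `screening_metrics` function, and the counts themselves are hypothetical and chosen only for illustration. AUC is omitted because it requires predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # NAFLD cases flagged as NAFLD
    fp: int  # non-cases flagged as NAFLD
    fn: int  # NAFLD cases missed by the classifier
    tn: int  # non-cases correctly cleared


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute standard binary classification metrics from raw counts."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration only; not taken from the study.
counts = ConfusionCounts(tp=80, fp=20, fn=10, tn=90)
metrics = screening_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a screening setting, recall is often the metric of most interest, since a missed case (a false negative) means a person with NAFLD is not referred for follow-up.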

Keywords: LASSO; machine learning; non-alcoholic fatty liver disease (NAFLD); predictive models; screening model.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Machine learning flowchart of this study. LR, logistic regression; RF, random forest; NB, naive Bayes; ML, machine learning; LASSO, least absolute shrinkage and selection operator.
Figure 2
LASSO algorithm for feature selection. (A) Mean-squared error (10-fold cross-validation criterion) of the LASSO-penalized logistic regression algorithm. (B) A vertical line is drawn at the value selected using 10-fold cross-validation, where the optimal lambda resulted in 12 features with nonzero coefficients.
Figure 3
ROC curve of all algorithms. LR, logistic regression; RF, random forest; NB, naive Bayes; XGB, XGBoost.
Figure 4
Feature importance in the XGBoost model, measured by F-score.
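
The Figure 2 caption notes that the selected lambda left 12 features with nonzero coefficients. The sketch below illustrates why an L1 penalty produces exact zeros, using the soft-thresholding operator. This is a simplified illustration under stated assumptions, not the authors' procedure: soft-thresholding is the closed-form LASSO solution only for squared-error loss with an orthonormal design, whereas the study used LASSO-penalized logistic regression with lambda chosen by 10-fold cross-validation. The coefficient values and the helper names `soft_threshold` and `count_nonzero` are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def count_nonzero(coefs: list[float], lam: float) -> int:
    """Number of coefficients that survive the penalty at strength lam."""
    return sum(1 for c in coefs if soft_threshold(c, lam) != 0.0)


# Hypothetical unpenalized coefficients, for illustration only.
unpenalized = [1.2, -0.9, 0.45, 0.3, -0.08, 0.02]

# A larger lambda removes more features from the model.
for lam in (0.0, 0.05, 0.25, 0.5, 1.0):
    shrunk = [round(soft_threshold(c, lam), 2) for c in unpenalized]
    print(f"lambda={lam}: {count_nonzero(unpenalized, lam)} nonzero -> {shrunk}")
```

Cross-validation then amounts to choosing the lambda whose fitted model predicts held-out data best, which fixes how many features are retained.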
