J Clin Endocrinol Metab. 2025 Oct 16;110(11):e3866-e3877.
doi: 10.1210/clinem/dgaf111.

Machine Learning-Based Biomarker Identification for Early Diagnosis of Metabolic Dysfunction-Associated Steatotic Liver Disease

Jolie Boullion et al. J Clin Endocrinol Metab.

Abstract

Context: Metabolic dysfunction-associated steatotic liver disease (MASLD) is an umbrella term for simple hepatic steatosis and the more severe metabolic dysfunction-associated steatohepatitis. The current reliance on liver biopsy for diagnosis and a lack of validated biomarkers are major factors contributing to the overall burden of MASLD.

Objective: This study investigates the association between biomarkers and hepatic steatosis and liver stiffness, both measured by FibroScan®.

Methods: Data from the National Health and Nutrition Examination Survey (2017-2020) were collected for 15 560 patients. Propensity score matching balanced the data at a 1:1 case-to-control ratio on age and sex, allowing for preliminary trend assessment. Random Forest machine learning determined variable importance for incorporating key biomarkers (age, sex, race, BMI, HbA1c, fasting plasma glucose, insulin, total cholesterol, LDL-cholesterol, HDL-cholesterol, triglycerides, ALT, AST, ALP, albumin, GGT, LDH, iron, total bilirubin, total protein, uric acid, BUN, and hs-CRP) into logistic regression models predicting steatosis (MASLD indicated by a controlled attenuation parameter score of ≥238 dB/m) and stiffness (hepatic fibrosis indicated by a median liver stiffness ≥7 kPa). Sensitivity analysis was performed using XGBoost and Recursive Feature Elimination.
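
The outcome definitions and the 1:1 matching step in the Methods can be illustrated with a minimal sketch. Only the two cutoffs (238 dB/m and 7 kPa) come from the abstract. The record fields, function names, and the matching rule are assumptions made for illustration: the abstract does not describe the propensity model, so the matching below is a simplified stand-in (same sex, nearest age, greedy, without replacement) and not the authors' procedure.

```python
# Cutoffs quoted in the Methods text.
CAP_STEATOSIS_DB_M = 238.0      # controlled attenuation parameter, dB/m
STIFFNESS_FIBROSIS_KPA = 7.0    # median liver stiffness, kPa


def label_outcomes(cap_db_m: float, stiffness_kpa: float) -> dict:
    """Return the two binary outcomes used as model targets.

    Argument names are hypothetical, not NHANES variable names.
    """
    return {
        "masld": cap_db_m >= CAP_STEATOSIS_DB_M,
        "fibrosis": stiffness_kpa >= STIFFNESS_FIBROSIS_KPA,
    }


def match_one_to_one(cases: list, controls: list) -> list:
    """Simplified stand-in for propensity score matching (an assumption).

    Each case is paired with the unused control of the same sex whose age
    is closest. Cases with no remaining same-sex control are dropped.
    """
    available = list(controls)  # copy so the caller's list is not modified
    pairs = []
    for case in cases:
        candidates = [c for c in available if c["sex"] == case["sex"]]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(c["age"] - case["age"]))
        available.remove(best)  # without replacement: each control used once
        pairs.append((case, best))
    return pairs
```

A real propensity score analysis would first fit a model of case status on the covariates and then match on the fitted score; the greedy rule above only conveys the idea of balancing cases and controls 1:1 on age and sex.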

Results: The Random Forest models, which were the most accurate, predicted MASLD with 79.59% accuracy (P < .001) and 84.65% specificity, and predicted hepatic fibrosis with 86.07% accuracy (P < .001) and 98.01% sensitivity. Both the steatosis and stiffness models identified statistically significant biomarkers; age, BMI, and insulin were significant in both.
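
For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard definitions from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # share of true cases that are detected
        "specificity": tn / (tn + fp),    # share of true non-cases that are ruled out
    }


# Made-up counts for illustration only (not study data).
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

High sensitivity with lower specificity (or the reverse) can coexist with similar accuracy, which is why the abstract reports these figures separately for each model.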

Conclusion: These findings indicate that assessing a variety of biomarkers across demographic, metabolic, lipid, and standard biochemistry categories may provide valuable initial insights for diagnosing MASLD and hepatic fibrosis.

Keywords: MASLD; biomarker; hepatic fibrosis.

Figures

Figure 1. Flowchart explaining data collection, feature selection, and statistical analyses.
