A machine learning approach to personalized predictors of dyslipidemia: a cohort study
- PMID: 37799151
- PMCID: PMC10548235
- DOI: 10.3389/fpubh.2023.1213926
Abstract
Introduction: Mexico has the second-highest prevalence of adult obesity in the world, and obesity increases the probability of developing dyslipidemia. Dyslipidemia is closely related to cardiovascular diseases, which are the leading cause of death in the country. Tools that facilitate the prediction of dyslipidemias are therefore essential for prevention and early treatment.
Methods: We used a dataset from a Mexico City cohort of 2,621 participants, men and women aged 20 to 50 years, with and without some type of dyslipidemia. Our primary objective was to identify factors potentially associated with different types of dyslipidemia in both men and women using machine learning algorithms. For feature selection, we applied the Variable Importance Measures (VIM) of Random Forest (RF), XGBoost, and Gradient Boosting Machine (GBM). To address class imbalance, we resampled the dataset with the Synthetic Minority Over-sampling Technique (SMOTE). The dataset encompassed anthropometric measurements, biochemical tests, dietary intake, family health history, and other health parameters, including smoking habits, alcohol consumption, quality of sleep, and physical activity.
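The resampling step named in the Methods can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is an interpolation between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' pipeline; the function name `smote`, its parameters, and the toy values (body mass index and uric acid pairs) are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy minority class: [body mass index, uric acid in mg/dL].
minority = [[31.0, 7.2], [29.5, 6.8], [33.1, 7.9], [30.2, 7.0]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the range already covered by the minority class. In practice, oversampling of this kind is usually applied to the training split only, so that evaluation data contain no synthetic records.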
Results: The VIM of RF yielded the best-performing subset of attributes, closely followed by GBM, with a balanced accuracy of up to 80%. The best subset of attributes was selected by comparing classifier performance in terms of balanced accuracy, sensitivity, and specificity.
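The three evaluation metrics named in the Results can be computed from a binary confusion matrix, as in the sketch below. The labels and predictions are hypothetical toy values (1 = dyslipidemia, 0 = no dyslipidemia), not data from the study, and the helper names are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of true positives that the classifier detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives that the classifier correctly rejects."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity; unaffected by class proportions."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Hypothetical toy labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
bal_acc = balanced_accuracy(tp, fn, tn, fp)
plain_acc = (tp + tn) / len(y_true)
```

For these toy labels, sensitivity is 0.75, specificity is 5/6 (about 0.833), and balanced accuracy is about 0.792, while plain accuracy is 0.80. Balanced accuracy weights both classes equally, which is why it is the more informative summary when one class is much rarer than the other.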
Discussion: The machine learning analysis identified the top five features contributing to an increased risk of various types of dyslipidemia: body mass index, elevated uric acid levels, age, sleep disorders, and anxiety. These findings shed light on significant factors in the development of dyslipidemia and can aid in the early identification, prevention, and treatment of this condition.
Keywords: Mexico City; Tlalpan 2020 cohort; feature selection; hypercholesterolemia; hypertriglyceridemia; hypoalphalipoproteinemia; machine learning; mixed hyperlipidemias.
Copyright © 2023 Gutiérrez-Esparza, Pulido, Martínez-García, Ramírez-delReal, Groves-Miralrio, Márquez-Murillo, Amezcua-Guerra, Vargas-Alarcón and Hernández-Lemus.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Similar articles
- Tlalpan 2020 Case Study: Enhancing Uric Acid Level Prediction with Machine Learning Regression and Cross-Feature Selection. Nutrients. 2025 Mar 17;17(6):1052. doi: 10.3390/nu17061052. PMID: 40292490. Free PMC article.
- Predicting dyslipidemia in Chinese elderly adults using dietary behaviours and machine learning algorithms. Public Health. 2025 Jan;238:274-279. doi: 10.1016/j.puhe.2024.12.025. Epub 2024 Dec 19. PMID: 39706104.
- Prediction of metabolic and pre-metabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea. BMC Public Health. 2022 Apr 6;22(1):664. doi: 10.1186/s12889-022-13131-x. PMID: 35387629. Free PMC article.
- Developing machine learning-based models to predict intrauterine insemination (IUI) success by address modeling challenges in imbalanced data and providing modification solutions for them. BMC Med Inform Decis Mak. 2022 Sep 1;22(1):228. doi: 10.1186/s12911-022-01974-8. PMID: 36050710. Free PMC article.
- Closing the Gaps in Care of Dyslipidemia: Revolutionizing Management with Digital Health and Innovative Care Models. Rev Cardiovasc Med. 2023 Dec 13;24(12):350. doi: 10.31083/j.rcm2412350. eCollection 2023 Dec. PMID: 39077078. Free PMC article. Review.
Cited by
- Predicting dyslipidemia incidence: unleashing machine learning algorithms on Lifestyle Promotion Project data. BMC Public Health. 2024 Jul 3;24(1):1777. doi: 10.1186/s12889-024-19261-8. PMID: 38961394. Free PMC article.
- Determination of Lifestyle Habits Correlating to the Prevalence of Hypertension, Diabetes, and Dyslipidemia by the Analysis of Health-Related Questionnaire Datasets in Japanese Nationwide Open Data. Cureus. 2025 Jan 7;17(1):e77105. doi: 10.7759/cureus.77105. eCollection 2025 Jan. PMID: 39917101. Free PMC article.
- Predictive value of anthropometric indices for incident of dyslipidemia: a large population-based study. Popul Health Metr. 2025 Aug 19;23(1):48. doi: 10.1186/s12963-025-00410-z. PMID: 40830804. Free PMC article.
- Nanotechnology and Artificial Intelligence in Dyslipidemia Management-Cardiovascular Disease: Advances, Challenges, and Future Perspectives. J Clin Med. 2025 Jan 29;14(3):887. doi: 10.3390/jcm14030887. PMID: 39941558. Free PMC article. Review.
- Development and evaluation of a machine learning model for osteoporosis risk prediction in Korean women. BMC Womens Health. 2025 Mar 28;25(1):146. doi: 10.1186/s12905-025-03669-4. PMID: 40155887. Free PMC article.