Analysis of epidemiological association patterns of serum thyrotropin by combining random forests and Bayesian networks
- PMID: 35862421
- PMCID: PMC9302835
- DOI: 10.1371/journal.pone.0271610
Erratum in
- Correction: Analysis of epidemiological association patterns of serum thyrotropin by combining random forests and Bayesian networks. PLoS One. 2023 Nov 10;18(11):e0294489. doi: 10.1371/journal.pone.0294489. eCollection 2023. PMID: 37948441
Abstract
Background: Approaching epidemiological data with flexible machine learning algorithms is of great value for understanding disease-specific association patterns. However, it can be difficult to correctly extract and understand those patterns due to the lack of model interpretability.
Method: We propose a machine learning workflow that combines random forests with Bayesian network surrogate models to allow a deeper interpretation of complex association patterns. We first evaluate the proposed workflow on synthetic data. We then apply it to data from the large population-based Study of Health in Pomerania (SHIP). Using this combination, we discover and interpret broad association patterns of individual serum TSH concentrations, an important marker of thyroid function.
Results: Evaluations using simulated data show that feature associations can be correctly recovered by combining random forests and Bayesian networks. The presented model achieves predictive accuracy similar to that of state-of-the-art models (root mean square error of 0.66, mean absolute error of 0.55, coefficient of determination R² = 0.15). We identify 62 relevant features from the final random forest model, ranging from general health variables through dietary and genetic factors to physiological, hematological and hemostasis parameters. The Bayesian network model is used to put these features into context and to make the black-box random forest model more understandable.
Conclusion: We demonstrate that combining random forest and Bayesian network analysis helps to reveal and interpret broad association patterns of individual TSH concentrations. The discovered patterns are in line with state-of-the-art literature. They may be useful for future thyroid research and improved dosing of therapeutics.
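The three accuracy measures quoted in the Results (root mean square error, mean absolute error and coefficient of determination) follow standard definitions. The sketch below shows one way to compute them in plain Python. The function name `regression_metrics` and the toy values are hypothetical and purely illustrative; they are neither the authors' code nor SHIP data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R^2
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n

    # Total sum of squares around the mean of the observed values
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical toy values for illustration only (not study data)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Under this definition, an R² of 0.15 would mean that the predictions account for roughly 15% of the variance in the observed values, which is why the features are interpreted as association patterns rather than as a precise predictor.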
Conflict of interest statement
The authors have declared that no competing interests exist.