Machine learning-based prediction of rheumatoid arthritis with development of ACPA autoantibodies in the presence of non-HLA genes polymorphisms
- PMID: 38517871
- PMCID: PMC10959370
- DOI: 10.1371/journal.pone.0300717
Abstract
Machine learning (ML) algorithms can handle complex genomic data and identify predictive patterns that may not be apparent through traditional statistical methods. They have become popular tools for medical applications, including the prediction, diagnosis, and treatment of complex diseases such as rheumatoid arthritis (RA). RA is an autoimmune disease in which genetic factors play a major role. Among the most important genetic factors that predispose to the disease and serve as genetic markers are single nucleotide polymorphisms (SNPs) in HLA-DRB and non-HLA genes. Another marker of RA is the presence of anti-citrullinated peptide antibodies (ACPA), which is correlated with the severity of RA. We use genetic data on SNPs in five non-HLA genes (PTPN22, STAT4, TRAF1, CD40 and PADI4) to predict the occurrence of ACPA-positive RA in the Polish population. This work is a comprehensive comparative analysis in which we assess and compare various ML classifiers. Our evaluation covers a range of models: logistic regression, k-nearest neighbors, naïve Bayes, decision tree, boosted trees, multilayer perceptron, and support vector machines. The top-performing models reached closely matched levels of accuracy, each with its own particular strengths. Among these, we recommend the decision tree as the first choice, given its strong performance and interpretability. The sensitivity and specificity of the ML models are about 70%, which is satisfactory. In addition, we introduce a novel feature importance estimation method characterized by transparent interpretability and global optimality. This method explores all possible combinations of polymorphisms, which lets us pinpoint those with the highest predictive power.
Taken together, these findings suggest that non-HLA SNPs make it possible to identify the group of individuals more prone to developing rheumatoid arthritis and, in turn, to implement a more precise preventive approach.
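The idea of scoring every combination of polymorphisms, and of reporting sensitivity and specificity, can be illustrated with a minimal sketch. This is not the authors' method or code: the classifier here is a simple majority-vote lookup table over genotype patterns, the 0/1/2 genotype coding, the fold assignment, the balanced-accuracy score, and the names `snp_A`, `snp_B`, `snp_C` are all assumptions made for illustration, and the data are synthetic. For n SNPs there are 2^n - 1 non-empty subsets, so an exhaustive search of this kind is only practical for a small panel.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    # Assumption: report 0.0 when a class is absent instead of raising.
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def fit_table(X, y, cols):
    """Majority-vote label for each genotype pattern seen on the chosen columns."""
    votes = defaultdict(Counter)
    for row, label in zip(X, y):
        votes[tuple(row[c] for c in cols)][label] += 1
    table = {pattern: c.most_common(1)[0][0] for pattern, c in votes.items()}
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen patterns
    return table, default


def cv_balanced_accuracy(X, y, cols, k=5):
    """Mean of (sensitivity + specificity) / 2 over k folds (sample i -> fold i % k)."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        test = [i for i in range(len(y)) if i % k == fold]
        table, default = fit_table([X[i] for i in train], [y[i] for i in train], cols)
        pred = [table.get(tuple(X[i][c] for c in cols), default) for i in test]
        sens, spec = sensitivity_specificity([y[i] for i in test], pred)
        scores.append((sens + spec) / 2)
    return sum(scores) / k


def rank_snp_subsets(X, y, names, k=5):
    """Score every non-empty SNP subset; best score first, smaller subsets win ties."""
    results = []
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            score = cv_balanced_accuracy(X, y, cols, k)
            results.append((score, tuple(names[c] for c in cols)))
    return sorted(results, key=lambda r: (-r[0], len(r[1])))


# Synthetic toy data: three hypothetical SNPs coded as minor-allele counts 0/1/2.
# By construction, only snp_A determines the label.
names = ["snp_A", "snp_B", "snp_C"]
X = [list(g) for g in product(range(3), repeat=3)] * 4
y = [1 if row[0] >= 1 else 0 for row in X]

ranking = rank_snp_subsets(X, y, names)
print(ranking[0])
```

Because every subset is evaluated, the top-ranked subset is the best one under the chosen score rather than the result of a greedy search, which is the sense in which an exhaustive approach can be called globally optimal. On real genotype data the lookup table would be replaced by whichever classifier is under study.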
Copyright: © 2024 Dudek et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
The authors have declared that no competing interests exist.