Detection of primary Sjögren's syndrome in primary care: developing a classification model with the use of routine healthcare data and machine learning
- PMID: 35945489
- PMCID: PMC9361661
- DOI: 10.1186/s12875-022-01804-w
Abstract
Background: Primary Sjögren's Syndrome (pSS) is a rare autoimmune disease that is difficult to diagnose due to a variety of clinical presentations, resulting in misdiagnosis and late referral to specialists. To improve early-stage disease recognition, this study aimed to develop an algorithm to identify possible pSS patients in primary care. We built a machine learning algorithm which was based on combined healthcare data as a first step towards a clinical decision support system.
Method: Routine healthcare data, consisting of primary care electronic health record (EHR) data and hospital claims data (HCD), were linked at the patient level. The linked dataset comprised 1411 pSS and 929,179 non-pSS patients. Logistic regression (LR) and random forest (RF) models were used to classify patients using age, gender, diseases and symptoms, prescriptions and GP visits.
Results: The LR and RF models had an AUC of 0.82 and 0.84, respectively. Both models identified most actual pSS patients (sensitivity LR = 72.3%, RF = 70.1%); specificity was 74.0% (LR) and 77.9% (RF), and the negative predictive value was 99.9% for both models. However, most patients classified as pSS patients did not have a diagnosis of pSS in secondary care (positive predictive value LR = 0.4%, RF = 0.5%).
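The very low positive predictive value follows from the rarity of pSS in the dataset rather than from poor discrimination. A minimal sketch of that arithmetic is given below. It assumes the reported sensitivity and specificity apply at the overall prevalence implied by the cohort sizes in the Method (1411 pSS among 930,590 patients); the function name `predictive_values` is hypothetical, and this is not the authors' evaluation code.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    Works on population fractions, so no absolute counts are needed.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Cohort sizes from the Method section; using them as the evaluation
# prevalence is an assumption made for illustration.
n_pss, n_non_pss = 1411, 929_179
prevalence = n_pss / (n_pss + n_non_pss)

# Sensitivity and specificity as reported in the Results.
reported = {"LR": (0.723, 0.740), "RF": (0.701, 0.779)}

results = {
    name: predictive_values(sens, spec, prevalence)
    for name, (sens, spec) in reported.items()
}

for name, (ppv, npv) in results.items():
    print(f"{name}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
# LR: PPV = 0.4%, NPV = 99.9%
# RF: PPV = 0.5%, NPV = 99.9%
```

Under this assumption the computed values agree with the reported ones: at a prevalence of roughly 0.15%, even a specificity near 78% leaves false positives far outnumbering true positives.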
Conclusion: This is the first study to use machine learning to classify patients with pSS in primary care using GP EHR data. Our algorithm has the potential to support the early recognition of pSS in primary care and should be validated and optimized in clinical practice. To further improve its detection of pSS in primary care, we suggest refining the algorithm in collaboration with experienced clinicians.
Keywords: Machine learning; Primary Sjögren’s syndrome; Primary care; Routine healthcare data.
© 2022. The Author(s).
Conflict of interest statement
The authors do not have any financial or non-financial conflicts of interest to declare.
Similar articles
- Raman spectroscopy combined with machine learning algorithms for rapid detection Primary Sjögren's syndrome associated with interstitial lung disease. Photodiagnosis Photodyn Ther. 2022 Dec;40:103057. doi: 10.1016/j.pdpdt.2022.103057. Epub 2022 Aug 6. PMID: 35944848
- Predicting autoimmune thyroiditis in primary Sjogren's syndrome patients using a random forest classifier: a retrospective study. Arthritis Res Ther. 2025 Jan 2;27(1):1. doi: 10.1186/s13075-024-03469-5. PMID: 39748261 Free PMC article.
- Determinants of diagnosis and disease course in primary Sjögren's syndrome: Results from datamining of electronic health records. Int J Rheum Dis. 2019 Sep;22(9):1768-1774. doi: 10.1111/1756-185X.13641. Epub 2019 Jul 21. PMID: 31328441
- Primary Sjögren's syndrome. Lupus. 2018 Oct;27(1_suppl):32-35. doi: 10.1177/0961203318801673. PMID: 30452329 Review.
- Anti-Sjögren's-syndrome-related antigen A autoantibodies (Anti-SSA antibody) and meningoencephalitis: Sjögren's syndrome waiting to be unveiled? A case series and review of literature. Rheumatol Int. 2021 Oct;41(10):1855-1866. doi: 10.1007/s00296-020-04716-z. Epub 2020 Oct 11. PMID: 33040168 Review.
Cited by
- Implications of Data Extraction and Processing of Electronic Health Records for Epidemiological Research: Observational Study. J Med Internet Res. 2025 Jun 11;27:e64628. doi: 10.2196/64628. PMID: 40498913 Free PMC article.
- Automatically pre-screening patients for the rare disease aromatic l-amino acid decarboxylase deficiency using knowledge engineering, natural language processing, and machine learning on a large EHR population. J Am Med Inform Assoc. 2024 Feb 16;31(3):692-704. doi: 10.1093/jamia/ocad244. PMID: 38134953 Free PMC article.
- Integration of Artificial Intelligence into the Approach for Diagnosis and Monitoring of Dry Eye Disease. Diagnostics (Basel). 2022 Dec 14;12(12):3167. doi: 10.3390/diagnostics12123167. PMID: 36553174 Free PMC article. Review.
- Prediction of Sjögren's disease diagnosis using matched electronic dental-health record data. BMC Med Inform Decis Mak. 2024 Feb 9;24(1):43. doi: 10.1186/s12911-024-02448-9. PMID: 38336735 Free PMC article.
- Reliability of non-contact tongue diagnosis for Sjögren's syndrome using machine learning method. Sci Rep. 2023 Jan 24;13(1):1334. doi: 10.1038/s41598-023-27764-4. PMID: 36693892 Free PMC article.