BMC Prim Care. 2022 Aug 9;23(1):199.
doi: 10.1186/s12875-022-01804-w.

Detection of primary Sjögren's syndrome in primary care: developing a classification model with the use of routine healthcare data and machine learning

Jesper T Dros et al. BMC Prim Care. 2022.

Abstract

Background: Primary Sjögren's Syndrome (pSS) is a rare autoimmune disease that is difficult to diagnose due to a variety of clinical presentations, resulting in misdiagnosis and late referral to specialists. To improve early-stage disease recognition, this study aimed to develop an algorithm to identify possible pSS patients in primary care. We built a machine learning algorithm which was based on combined healthcare data as a first step towards a clinical decision support system.

Method: Routine healthcare data, consisting of primary care electronic health record (EHR) data and hospital claims data (HCD), were linked at the patient level and covered 1,411 pSS and 929,179 non-pSS patients. Logistic regression (LR) and random forest (RF) models were used to classify patients using age, gender, diseases and symptoms, prescriptions, and GP visits.

Results: The LR and RF models had an AUC of 0.82 and 0.84, respectively. Many actual pSS patients were found (sensitivity LR = 72.3%, RF = 70.1%), specificity was 74.0% (LR) and 77.9% (RF) and the negative predictive value was 99.9% for both models. However, most patients classified as pSS patients did not have a diagnosis of pSS in secondary care (positive predictive value LR = 0.4%, RF = 0.5%).
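The very low positive predictive value is largely a consequence of how rare pSS is in the linked data. As a rough consistency check, the sketch below applies Bayes' rule to the sensitivity, specificity, and cohort counts reported above. It assumes that the cohort prevalence (1411 of 930,590 patients) also holds in the data on which the metrics were evaluated, which the abstract does not state; the function name `predictive_values` is illustrative only and is not taken from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence              # true positives, as a population fraction
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)

# Cohort counts from the Method section.
# Assumption: the evaluation data has the same prevalence as the full cohort.
N_PSS, N_NON_PSS = 1411, 929_179
prevalence = N_PSS / (N_PSS + N_NON_PSS)

# (sensitivity, specificity) as reported in the Results section
reported = {"LR": (0.723, 0.740), "RF": (0.701, 0.779)}

results = {
    name: predictive_values(sens, spec, prevalence)
    for name, (sens, spec) in reported.items()
}
for name, (ppv, npv) in results.items():
    print(f"{name}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under that assumption, both models come out at a PPV of roughly 0.4 to 0.5% and an NPV of about 99.9%, in line with the reported values: with a prevalence near 0.15%, even a specificity above 74% leaves false positives far outnumbering true positives.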

Conclusion: This is the first study to use machine learning to classify patients with pSS in primary care using GP EHR data. Our algorithm has the potential to support the early recognition of pSS in primary care and should be validated and optimized in clinical practice. To further enhance the algorithm's ability to detect pSS in primary care, we suggest refining it in collaboration with experienced clinicians.

Keywords: Machine learning; Primary Sjögren’s syndrome; Primary care; Routine healthcare data.

Conflict of interest statement

The authors do not have any financial or non-financial conflicts of interest to declare.

Figures

Fig. 1: Schematic overview of the data analysis process
Fig. 2: Receiver Operating Characteristic of both the logistic regression and random forest
Fig. 3: Precision Recall Curve of both the logistic regression and random forest
