Observational Study
RMD Open. 2024 Jan 30;10(1):e003353.
doi: 10.1136/rmdopen-2023-003353.

Prevalence and clinical characteristics of patients with rheumatoid arthritis with interstitial lung disease using unstructured healthcare data and machine learning

Jose A Román Ivorra et al. RMD Open.

Abstract

Objectives: Real-world data on rheumatoid arthritis (RA) and its association with interstitial lung disease (ILD) are still scarce. This study aimed to estimate the prevalence of RA and of ILD in patients with RA (RAILD) in Spain, and to compare the clinical characteristics of patients with RA with and without ILD using natural language processing (NLP) on electronic health records (EHRs).

Methods: Observational, retrospective, multicentre case-control study based on the secondary use of unstructured clinical data from adult patients with RA and RAILD from nine hospitals between 2014 and 2019. NLP was used to extract unstructured clinical information from EHRs and standardise it to SNOMED-CT terminology. The prevalence of RA and of RAILD was calculated, and a descriptive analysis was performed. Characteristics of patients with RAILD and of patients with RA without ILD (RAnonILD) were compared.

Results: From a source population of 3 176 165 patients and 64 241 683 EHRs, 13 958 patients with RA were identified. Of those, 5.1% also had ILD (RAILD). The overall age-adjusted prevalences of RA and RAILD were 0.53% and 0.02%, respectively. The most common ILD subtype was usual interstitial pneumonia (29.3%). Compared with RAnonILD patients, RAILD patients were older and had more comorbidities, notably infections (33.6% vs 16.5%, p<0.001), malignancies (15.9% vs 8.5%, p<0.001) and cardiovascular disease (25.8% vs 13.9%, p<0.001). RAILD patients also had a higher inflammatory burden, reflected in more pharmacological prescriptions and higher inflammatory parameters, and presented a higher in-hospital mortality with a higher risk of death (HR 2.32; 95% CI 1.59 to 2.81, p<0.001).
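As a rough cross-check of the figures above, the crude (unadjusted) proportions can be recomputed from the reported totals. This is only an illustrative sketch: the variable names are ours, the RAILD patient count is back-calculated from the rounded 5.1% and is therefore approximate, and crude values are not expected to match the age-adjusted estimates exactly.

```python
# Totals as reported in the abstract.
source_population = 3_176_165
ra_patients = 13_958
raild_share_of_ra = 0.051  # reported as a rounded percentage (5.1%)

# Back-calculated from a rounded percentage, so only approximate.
raild_patients_approx = round(ra_patients * raild_share_of_ra)

# Crude prevalence = cases / source population, expressed as a percentage.
crude_ra_prevalence_pct = 100 * ra_patients / source_population
crude_raild_prevalence_pct = 100 * raild_patients_approx / source_population

print(f"Approximate RAILD patients: {raild_patients_approx}")
print(f"Crude RA prevalence: {crude_ra_prevalence_pct:.2f}%")
print(f"Crude RAILD prevalence: {crude_raild_prevalence_pct:.2f}%")
```

The crude RA prevalence comes out at about 0.44%, below the age-adjusted 0.53%, while the crude RAILD prevalence rounds to the same 0.02% as the adjusted figure.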

Conclusions: We estimated the age-adjusted prevalence of RA and RAILD in Spain by analysing real-world data through NLP. At the time of inclusion, RAILD patients were more vulnerable than RAnonILD patients, with a higher comorbidity and inflammatory burden, which correlated with higher mortality.

Keywords: Arthritis, Rheumatoid; Epidemiology; Pulmonary Fibrosis.

Conflict of interest statement

Competing interests: JARI has received support for research grants or contracts from BMS. ET-A has received payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from BMS, and support for attending meetings from Boehringer Ingelheim, Nordic Pharma and Merck Sharp & Dohme. JF-M has received payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from BMS, Amgen, Lilly, Galápagos, Boehringer Ingelheim and Novartis, and support for attending meetings from Boehringer Ingelheim. LS-F has received payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from Novartis, BMS and Lilly, support for attending meetings from Pfizer, Lilly and Novartis, and has participated on a Data Safety Monitoring Board or Advisory Board for Novartis, Merck Sharp & Dohme and Sanofi. BS has received consulting fees from Boehringer Ingelheim, payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from Roche and Boehringer Ingelheim, and support for attending meetings from Boehringer Ingelheim, Roche and AstraZeneca. JLA has received consulting fees from AbbVie, Amgen, AstraZeneca, Biogen, Celgene, Celltrion, Fresenius-Kabi, Galápagos, Gebro, GSK, Merck Sharp & Dohme, Pfizer, Regeneron and UCB, payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from AbbVie, Antares, Biogen, GSK, Janssen, Merck Sharp & Dohme, Lilly, Nordic, Novartis, Sanofi and UCB, and support for attending meetings from AbbVie, Gebro, Janssen, Merck Sharp & Dohme, Nordic, Novartis and Pfizer. EB has received payment or honoraria for educational events from Boehringer Ingelheim, and support for attending meetings and/or travel from Boehringer Ingelheim and Janssen. DB currently works at Savana Research and has received grants or contracts from Novartis, payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events from Janssen and AbbVie, and support for attending meetings from UCB, Novartis and AbbVie. DV works at BMS and owns stock as a BMS employee. SLV works at BMS and owns stock as a BMS employee. RACM works at BMS. The present manuscript was fully supported by Bristol-Myers Squibb.

Figures

Figure 1
Study design. EHRead technology is a system based on natural language processing (NLP) that applies machine learning (ML) and deep learning to extract, analyse and interpret the free-text information written in millions of de-identified electronic health records (EHRs). The unstructured free-text information (and structured data, if available) from the EHRs of multiple participating sites was organised in study databases. Inclusion and exclusion criteria were specified to define the target population. The variables extracted from the database at different time points were organised and analysed to address multiple clinical questions. ILD, interstitial lung disease; RA, rheumatoid arthritis.
Figure 2
Age-adjusted prevalence for RA (A) and RAILD (B). ILD, interstitial lung disease; RA, rheumatoid arthritis.
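The abstract does not state which standardisation method or reference population was used to obtain the age-adjusted values. As a minimal sketch, assuming direct standardisation, each age band's prevalence is weighted by that band's share of a standard population. The function name, age bands, counts and weights below are all hypothetical and are not data from the study.

```python
def direct_age_standardised_prevalence(strata, standard_weights):
    """Weight each age band's prevalence by a standard population share.

    strata: {age_band: (cases, population)} for the study population.
    standard_weights: {age_band: weight} for the reference population;
    weights are normalised here, so they need not sum to 1.
    """
    total_weight = sum(standard_weights.values())
    adjusted = 0.0
    for band, (cases, population) in strata.items():
        band_prevalence = cases / population
        adjusted += band_prevalence * standard_weights[band] / total_weight
    return adjusted


# Hypothetical strata and weights, for illustration only.
toy_strata = {"18-44": (20, 10_000), "45-64": (60, 8_000), "65+": (90, 6_000)}
toy_weights = {"18-44": 0.5, "45-64": 0.3, "65+": 0.2}

toy_adjusted = direct_age_standardised_prevalence(toy_strata, toy_weights)
toy_crude = sum(c for c, _ in toy_strata.values()) / sum(
    p for _, p in toy_strata.values()
)

print(f"Crude: {100 * toy_crude:.3f}%  Age-adjusted: {100 * toy_adjusted:.3f}%")
```

In this toy example the adjusted value is lower than the crude one because the hypothetical standard population is younger than the toy study population; the direction of the shift depends entirely on the age structures involved.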
Figure 3
Frequency of comorbidities at the time of inclusion. COPD, chronic obstructive pulmonary disease; ILD, interstitial lung disease; RA, rheumatoid arthritis; TIA, transient ischaemic attack. *Statistical differences between RAILD and RAnonILD. Differences were considered significant when p<0.05 in two-tailed tests.
Figure 4
In-hospital mortality. Kaplan-Meier survival curves of RAILD and RAnonILD patients. Patients with no reported death were censored at the time of their last EHR available. EHRs, electronic health records; ILD, interstitial lung disease; RA, rheumatoid arthritis.
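To illustrate how censoring enters a Kaplan-Meier curve like the ones described here, the following is a minimal sketch of the standard product-limit estimator. It is not the authors' analysis code; the function name and the toy follow-up data are hypothetical. An event flag of 1 marks a reported death and 0 marks a patient censored at their last available EHR.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Censored patients at time t still count as at risk at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event flags.
toy_durations = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_durations, toy_events)

for time, prob in toy_curve:
    print(f"t={time}: S(t)={prob:.2f}")
```

Censored patients never lower the curve directly; they only reduce the number at risk at later event times, which is why patients without a reported death still contribute information up to their last record.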
