RMD Open. 2025 Nov 27;11(4):e005773.
doi: 10.1136/rmdopen-2025-005773.

Identification of individuals at high risk of developing rheumatoid arthritis: a balanced random forest model in a cohort of 1544 first-degree relatives

Romain Aymon et al. RMD Open. 2025.

Abstract

Objectives: To identify in a genetically susceptible population individuals at higher risk of developing rheumatoid arthritis (RA) using a classification approach combining known epidemiological risk factors, serological biomarkers, genetics, clinical signs and symptoms.

Methods: We used data from the prospective SCREEN-RA (Evaluation of a SCREENing strategy for Rheumatoid Arthritis) cohort of 1544 first-degree relatives of RA patients (RA-FDRs). The primary outcome was the development of RA. Additionally, we used seropositive inflammatory arthritis (IA) as a secondary outcome for exploratory analyses. Balanced random forest (BRF) models were fitted and evaluated through fivefold cross-validation to avoid overfitting. We chose a classification threshold that targeted high sensitivity.
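The class balancing behind a BRF can be sketched as follows. This is a minimal illustration, not the study's code: each tree is trained on a bootstrap sample that draws the same number of rows from the minority and majority classes. The function names are hypothetical, the labels are toy data sized like the cohort, and stratifying the folds by outcome is an assumption that the abstract does not state.

```python
import random

def balanced_bootstrap(labels, rng):
    """Row indices for one tree: equal-sized bootstrap draws from each class."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))  # size of the minority class
    # Sample with replacement, n draws from each class.
    return rng.choices(pos, k=n) + rng.choices(neg, k=n)

def stratified_folds(labels, k, rng):
    """Assign every row to one of k folds, spreading each class evenly."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

rng = random.Random(0)
# Toy labels only: 27 cases among 1544 rows, not the actual study data.
labels = [1] * 27 + [0] * (1544 - 27)
sample = balanced_bootstrap(labels, rng)
folds = stratified_folds(labels, 5, rng)
```

With so few cases, every balanced sample is small (twice the number of cases), which is one reason many trees are needed to cover the majority class.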

Results: After a mean follow-up of 7.1 years, 27 participants developed RA and 126 developed seropositive IA. The BRF demonstrated moderate predictive performance, characterised by high sensitivity (≥0.85) but modest specificity. Rheumatoid factors (RFs) had the highest importance in RA prediction, followed by symptoms on the 'clinically suspected arthralgia' (CSA) scale. Age, gender and anti-RA33 autoantibodies were the main variables for the prediction of seropositive IA.
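The trade-off between high sensitivity and modest specificity follows from how the cut-off is chosen. The sketch below is one plausible way to do it, assuming the rule is "the highest cut-off whose sensitivity reaches the target"; the abstract does not give the exact procedure. The function name and the scores are made up for illustration.

```python
def pick_threshold(scores, labels, target_sensitivity=0.85):
    """Highest cut-off whose sensitivity meets the target.

    Returns (threshold, sensitivity, specificity).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for t in sorted(set(scores), reverse=True):
        # A row is flagged as high risk when its score is at or above t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        if tp / n_pos >= target_sensitivity:
            return t, tp / n_pos, tn / n_neg
    raise ValueError("empty input or target sensitivity above 1")

# Made-up predicted probabilities and outcomes, not study data.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
threshold, sens, spec = pick_threshold(scores, labels)
```

In this toy example the cut-off has to drop to 0.4 to capture the last case, which flags two non-cases as well and so lowers specificity.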

Conclusions: Overall, the results demonstrate that predicting RA by combining genetics, serological biomarkers, epidemiological risk factors and clinical signs is promising, although model generalisation remains challenging. The low prevalence of RA in the cohort complicates the development of highly accurate prediction models. Future efforts should focus on including external validation and potentially incorporating additional biomarkers to enhance the specificity and overall performance of the predictive tests.
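A short calculation shows why low prevalence makes accurate prediction hard. It applies Bayes' rule to the case count and cohort size reported above, with a sensitivity of 0.85. The specificity of 0.60 is a purely hypothetical value chosen for illustration, since the abstract reports only "modest" specificity.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true cases (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

prevalence = 27 / 1544   # RA cases over cohort size, as reported
sensitivity = 0.85       # lower bound quoted in the Results
specificity = 0.60       # hypothetical value, NOT from the study

ppv = positive_predictive_value(prevalence, sensitivity, specificity)
print(f"prevalence = {prevalence:.3%}, PPV = {ppv:.1%}")
```

Under these assumptions only a few percent of individuals flagged as high risk would go on to develop RA, so gains in specificity translate directly into fewer false alarms.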

Keywords: Arthritis, Rheumatoid; Biomarkers; Epidemiology; Machine Learning; Sensitivity and Specificity.

Conflict of interest statement

Competing interests: RA, none declared. CL, none declared. BG has received speaker fees from Lilly, outside the submitted work. MG, IG and SS are employees of Thermo Fisher Scientific—Phadia GmbH. OS, none declared. ZS, none declared. RG, none declared. DS, none declared. JD, none with the submitted work. BM, none declared. DD has received speaker’s fees from Eli Lilly, Novartis, UCB, GSK, Menarini, Viatris, for attending meetings from Abbvie, UCB, Janssen and for participation on an advisory board from Novartis, all outside the submitted work. LB, none declared. IvM has received support for attending meetings and/or travel from Novartis, Abbvie, Pfizer and UCB, outside the submitted work. DK has received consulting/speaker’s fees from Abbvie, Pfizer, Eli Lilly, Sanofi, UCB and Novartis, outside the submitted work. ARR has received consulting fees from Abbvie, Gilead, Lilly and BMS, speaker’s fees from Abbvie, Pfizer, Sanofi, UCB, BMS, Lilly, Gilead and Roche, and payment for expert testimony from Abbvie and Gilead, all outside the submitted work. AC, none declared. RM, none declared. DSC, none declared. AF has received grants or contracts (Eli Lilly, Pfizer, AbbVie, Gilead and BMS), consulting fees (AstraZeneca, AbbVie, Pfizer and Gilead) and honorary payments (BMIS, AbbVie, Eli Lilly, Pfizer and MSD) and participated in advisory boards (AstraZeneca, Gilead, Novartis, AbbVie, Eli Lilly, Pfizer, J&J, Mylan and UCB).

Figures

Figure 1. Variable importance for outcomes (i) rheumatoid arthritis and (ii) seropositive inflammatory arthritis, 6 to 18 months before the reference date. Diabetes = type I & type II. Family base multi = more than one first-degree relative has RA and/or another autoimmune disease. Random = random variable from Bernoulli distribution with probability 0.5. ACPA, anti-citrullinated protein autoantibody; BMI, Body Mass Index; CSA, clinically suspected arthralgia; RA, rheumatoid arthritis; RA33, anti-RA33 autoantibodies; RF, rheumatoid factor; SE, shared epitope; UPA, pack-year.