Diagnostics (Basel). 2025 Jun 7;15(12):1455.
doi: 10.3390/diagnostics15121455.

Evaluation of the Performance of Large Language Models in the Management of Axial Spondyloarthropathy: Analysis of EULAR 2022 Recommendations

Ahmet Usen et al. Diagnostics (Basel). 2025.

Abstract

Introduction: Guidelines play an important role in the management of complex, chronic conditions such as axial spondyloarthropathy. The aim of this study was to compare the answers given by various large language models to open-ended questions derived from the ASAS-EULAR 2022 guideline.

Materials and Methods: This was a cross-sectional, comparative study. A total of 15 recommendations in the ASAS-EULAR 2022 guideline were converted directly into open-ended questions framed in a clinical context. Each question was posed to the ChatGPT-3.5, GPT-4o, and Gemini 2.0 Flash models. The answers were rated on a seven-point Likert scale for usability and reliability; readability was assessed with the Flesch-Kincaid Reading Ease (FKRE) and Flesch-Kincaid Grade Level (FKGL) metrics; and semantic and surface-level similarity were assessed with the Universal Sentence Encoder (USE) and ROUGE-L, respectively. The results of the different large language models were compared statistically, and p < 0.05 was considered statistically significant.

Results: Google Gemini 2.0 achieved better FKRE and FKGL scores (p < 0.05). Reliability and usefulness scores were significantly higher for ChatGPT-4o and Gemini 2.0 (p < 0.05). ChatGPT-4o yielded significantly higher semantic similarity scores than ChatGPT-3.5 (p < 0.05). There was a negative correlation between FKRE and FKGL scores and a positive correlation between reliability and usefulness scores (p < 0.05).

Conclusions: ChatGPT-4o and Gemini 2.0 gave more reliable, useful, and readable answers to open-ended questions derived from the ASAS-EULAR 2022 guidelines. These models may assist in supporting treatment decision-making in rheumatology in the future.
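
For readers unfamiliar with the readability and surface-similarity metrics named in the abstract, the following is a minimal Python sketch of how FKRE, FKGL, and ROUGE-L are commonly computed. It is not the authors' pipeline, which the abstract does not describe. The function names, the vowel-run syllable counter, the whitespace tokenization, and the use of an F1 score for ROUGE-L are all assumptions made for illustration; published tools use more careful syllable and sentence detection and will give somewhat different numbers.

```python
import re


def count_syllables(word):
    # Crude heuristic (assumption): one syllable per run of vowels, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text):
    """Return (FKRE, FKGL) using the standard Flesch and Flesch-Kincaid formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    fkre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fkre, fkgl


def lcs_length(a, b):
    # Length of the longest common subsequence of two token lists (row-by-row DP).
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    # ROUGE-L as an F1 score over LCS-based precision and recall (assumption:
    # lowercase whitespace tokens, equal weighting of precision and recall).
    ref = reference.lower().split()
    cand = candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the study.
    print(readability("The cat sat. The dog ran."))
    print(rouge_l_f1("the cat sat on the mat", "the cat is on the mat"))
```

Both readability formulas depend on the same two ratios (words per sentence and syllables per word) with opposite signs, so a higher FKRE generally goes with a lower FKGL. This is consistent with the negative correlation between the two scores reported in the Results.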

Keywords: artificial intelligence; guideline; rheumatic disease; spondylarthritis.

Conflict of interest statement

The authors declare that they have no conflicts of interest.
