Battle of the artificial intelligence: a comprehensive comparative analysis of DeepSeek and ChatGPT for urinary incontinence-related questions
- PMID: 40771241
- PMCID: PMC12325333
- DOI: 10.3389/fpubh.2025.1605908
Abstract
Background: With the rapid advancement and widespread adoption of artificial intelligence (AI), patients increasingly turn to AI for initial medical guidance. Therefore, a comprehensive evaluation of AI-generated responses is warranted. This study aimed to compare the performance of DeepSeek and ChatGPT in answering urinary incontinence-related questions and to delineate their respective strengths and limitations.
Methods: Based on the American Urological Association/Society of Urodynamics, Female Pelvic Medicine & Urogenital Reconstruction (AUA/SUFU) and European Association of Urology (EAU) guidelines, we designed 25 urinary incontinence-related questions. Responses from DeepSeek and ChatGPT-4.0 were evaluated for reliability, quality, and readability. Fleiss' kappa was employed to calculate inter-rater reliability. For clinical case scenarios, we additionally assessed the appropriateness of responses. A comprehensive comparative analysis was performed.
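For readers who want to reproduce the inter-rater reliability step, the sketch below computes Fleiss' kappa from a subject-by-rater score matrix using statsmodels. The ratings shown are hypothetical placeholders, not the study's reviewer data.

```python
# Minimal sketch of the Fleiss' kappa computation used for inter-rater
# reliability. The score matrix is hypothetical, for illustration only.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Rows = items (questions), columns = raters; values = ordinal scores (e.g., GQS 1-5).
ratings = np.array([
    [5, 5, 4],
    [4, 4, 4],
    [5, 4, 5],
    [3, 3, 4],
    [5, 5, 5],
])

# aggregate_raters converts subject-by-rater scores into the
# subject-by-category count table that fleiss_kappa expects.
table, categories = aggregate_raters(ratings)
kappa = fleiss_kappa(table, method="fleiss")
print(f"Fleiss' kappa = {kappa:.3f}")
```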
Results: The modified DISCERN (mDISCERN) scores for DeepSeek and ChatGPT-4.0 were 28.24 ± 0.88 and 28.76 ± 1.56, respectively, showing no statistically significant difference [P = 0.188, Cohen's d = 0.41 (95% CI: -0.15, 0.97)]. Both AI chatbots rarely provided source references. In terms of quality, DeepSeek achieved a higher mean Global Quality Scale (GQS) score than ChatGPT-4.0 (4.76 ± 0.52 vs. 4.32 ± 0.69, P = 0.001). DeepSeek also demonstrated superior readability, as indicated by a higher Flesch Reading Ease (FRE) score (76.43 ± 10.90 vs. 70.95 ± 11.16, P = 0.039) and a lower Simple Measure of Gobbledygook (SMOG) index (12.26 ± 1.39 vs. 14.21 ± 1.88, P < 0.001), suggesting easier comprehension. Regarding guideline adherence, DeepSeek had 11 of 15 fully compliant responses (73.33%) versus 13 of 15 (86.67%) for ChatGPT-4.0, with no significant difference [P = 0.651, Cohen's w = 0.083 (95% CI: 0.021, 0.232)].
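The two readability metrics reported above follow standard published formulas; a minimal sketch, using illustrative counts rather than the chatbots' actual text statistics:

```python
# Sketch of the two readability formulas from their standard definitions.
# The sentence/word/syllable counts below are illustrative inputs only.
import math

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Higher FRE = easier text (standard Flesch formula).
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def smog_index(polysyllables: int, sentences: int) -> float:
    # Lower SMOG = easier text; estimates the U.S. grade level
    # needed to understand the text.
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291

# Illustrative counts for a hypothetical chatbot answer:
print(f"FRE  = {flesch_reading_ease(words=180, sentences=12, syllables=260):.1f}")
print(f"SMOG = {smog_index(polysyllables=22, sentences=12):.1f}")
```

Because a higher FRE score and a lower SMOG grade both indicate easier text, DeepSeek's combination of the two points consistently toward better readability.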
Conclusion: DeepSeek and ChatGPT-4.0 might exhibit comparable reliability in answering urinary incontinence-related questions, though both lacked sufficient references. However, DeepSeek outperformed ChatGPT-4.0 in response quality and readability. While both AI chatbots largely adhered to clinical guidelines, occasional deviations were observed. Further refinements are necessary before the widespread clinical implementation of AI chatbots in urology.
Keywords: ChatGPT; DeepSeek; artificial intelligence; comparative analysis; urinary incontinence.
Copyright © 2025 Cao, Hao, Zhang, Zheng, Gao, Wu, Gan, Liu, Zeng and Wang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.