Chinese generative AI models (DeepSeek and Qwen) rival ChatGPT-4 in ophthalmology queries with excellent performance in Arabic and English
- PMID: 40352182
- PMCID: PMC12059827
- DOI: 10.52225/narra.v5i1.2371
Abstract
The rapid evolution of generative artificial intelligence (genAI) has ushered in a new era of digital medical consultations, with patients turning to AI-driven tools for guidance. The emergence of Chinese-developed genAI models such as DeepSeek-R1 and Qwen-2.5 has challenged the dominance of OpenAI's ChatGPT. The aim of this study was to benchmark the performance of these Chinese genAI models against ChatGPT-4o and to assess disparities in performance between English and Arabic. Following the METRICS checklist for genAI evaluation, the responses of Qwen-2.5, DeepSeek-R1, and ChatGPT-4o to common patient ophthalmology queries were assessed for completeness, accuracy, and relevance using the CLEAR tool. In English, Qwen-2.5 achieved the highest overall performance (CLEAR score: 4.43 ± 0.28), outperforming both DeepSeek-R1 (4.3 ± 0.43) and ChatGPT-4o (4.14 ± 0.41), with p = 0.002. The same ranking emerged in Arabic, with Qwen-2.5 again leading (4.40 ± 0.29), followed by DeepSeek-R1 (4.20 ± 0.49) and ChatGPT-4o (4.14 ± 0.41), with p = 0.007. Each tested genAI model exhibited near-identical performance across the two languages, with ChatGPT-4o demonstrating the most balanced linguistic capabilities (p = 0.957), while Qwen-2.5 and DeepSeek-R1 performed marginally better in English. An in-depth examination of genAI performance across key CLEAR components revealed that Qwen-2.5 consistently excelled in content completeness, factual accuracy, and relevance in both English and Arabic, setting a new benchmark for genAI in medical inquiries. Despite minor linguistic disparities, all three models exhibited robust multilingual capabilities, challenging the long-held assumption that genAI is inherently biased toward English. These findings highlight the evolving nature of AI-driven medical assistance, with Chinese genAI models able to rival or even surpass ChatGPT-4o in ophthalmology-related queries.
Keywords: DeepSeek; LLM; OpenAI; Qwen; eye disease.
© 2025 The Author(s).
Conflict of interest statement
All the authors declare that there are no conflicts of interest.