JMIR Med Inform. 2025 Aug 13:13:e65365.
doi: 10.2196/65365.

Assessing the Role of Large Language Models Between ChatGPT and DeepSeek in Asthma Education for Bilingual Individuals: Comparative Study

Yaxin Liu et al. JMIR Med Inform.

Abstract

Background: Asthma is a chronic inflammatory airway disease requiring long-term management. Artificial intelligence (AI)-driven tools such as large language models (LLMs) hold potential for enhancing patient education, especially for multilingual populations. However, comparative assessments of LLMs in disease-specific, bilingual health communication are limited.

Objective: This study aimed to evaluate and compare the performance of two advanced LLMs, ChatGPT-4o (OpenAI) and DeepSeek-v3 (DeepSeek AI), in providing bilingual (English and Chinese) education for patients with asthma, focusing on accuracy, completeness, clinical relevance, and language adaptability.

Methods: A total of 53 asthma-related questions were collected from real patient inquiries across 8 clinical domains. Each question was posed in both English and Chinese to ChatGPT-4o and DeepSeek-v3. Responses were evaluated using a 7-dimension clinical quality framework (eg, completeness, consensus consistency, and reasoning ability) adapted from Google Health. Three respiratory clinicians scored the responses under blinded conditions. Descriptive statistics and Wilcoxon signed-rank tests were applied to compare performance across domains and against theoretical maximums.
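As an illustration of the comparison against a theoretical maximum described in the Methods, the following is a minimal sketch of an exact one-sample Wilcoxon signed-rank test in plain Python. It assumes a maximum score of 4 (the figures describe a 1-4 scale). The function names and the scores are hypothetical, and the abstract does not state which software or test variant (exact or normal approximation, handling of zero differences) the authors used.

```python
from collections import Counter


def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_vs_reference(scores, reference):
    """Exact two-sided one-sample Wilcoxon signed-rank test against `reference`.

    Returns (W_plus, p_value). Zero differences are dropped, which is one
    common convention and an assumption here.
    """
    diffs = [s - reference for s in scores if s != reference]
    if not diffs:
        return 0.0, 1.0  # every score equals the reference: no evidence either way
    ranks = average_ranks([abs(d) for d in diffs])
    doubled = [round(2 * r) for r in ranks]  # integers even when ties give x.5 ranks
    w_plus2 = sum(r for r, d in zip(doubled, diffs) if d > 0)

    # Null distribution of (doubled) W+: each rank is positive with probability 1/2.
    dist = Counter({0: 1})
    for r in doubled:
        nxt = Counter()
        for total, count in dist.items():
            nxt[total] += count      # this rank gets a negative sign
            nxt[total + r] += count  # this rank gets a positive sign
        dist = nxt

    n_assignments = 2 ** len(doubled)
    p_low = sum(c for t, c in dist.items() if t <= w_plus2) / n_assignments
    p_high = sum(c for t, c in dist.items() if t >= w_plus2) / n_assignments
    return w_plus2 / 2, min(1.0, 2 * min(p_low, p_high))


# Hypothetical per-question mean scores for one domain (not data from the study).
hypothetical_scores = [3.67, 3.33, 4.0, 3.67, 3.0, 3.67]
w_plus, p_value = wilcoxon_vs_reference(hypothetical_scores, 4.0)
print(w_plus, p_value)
```

Because no score can exceed the maximum, every nonzero difference is negative and W+ is always 0, so the exact P value depends only on how many scores fall below the maximum. With the five below-maximum scores above it is 2/32 = 0.0625.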

Results: Both models demonstrated high overall quality in generating bilingual educational content. DeepSeek-v3 outperformed ChatGPT-4o in completeness and currency, particularly in treatment-related knowledge and symptom interpretation. ChatGPT-4o showed advantages in clarity and accessibility. In English responses, ChatGPT achieved perfect scores across 5 domains, but scored lower in clinical features (mean 3.78, SD 0.16; P=.02), treatment (mean 3.90, SD 0.05; P=.03), and differential diagnosis (mean 3.83, SD 0.29; P=.08).

Conclusions: ChatGPT-4o and DeepSeek-v3 each offer distinct strengths for bilingual asthma education. While ChatGPT is more suitable for general health education due to its expressive clarity, DeepSeek provides more up-to-date and comprehensive clinical content. Both models can serve as effective supplementary tools for patient self-management but cannot replace professional medical advice. Future AI health care systems should enhance clinical reasoning, ensure guideline currency, and integrate human oversight to optimize safety and accuracy.

Keywords: patient education; ChatGPT; DeepSeek; asthma; cross-linguistic study.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Response examples of the two large language models in Chinese and English language contexts.
Figure 2. Performance heatmap across seven evaluation dimensions (1–4 scale). CN: Chinese; EN: English.
Figure 3. Quality assessment of ChatGPT’s responses to asthma-related questions across 8 domains in the English language environment. *P<.05; NS: not significant (P≥.05).
