Large Language Models for Thematic Summarization in Qualitative Health Care Research: Comparative Analysis of Model and Human Performance
- PMID: 40611510
- PMCID: PMC12231516
- DOI: 10.2196/64447
Abstract
Background: The application of large language models (LLMs) to analyzing online textual data written by experts is a topic of growing importance in computational linguistics and in qualitative research within health care settings.
Objective: The objective of this study was to understand how LLMs can help analyze expert textual data. Topic modeling makes it possible to scale thematic analysis to a large corpus, but the resulting topics still require interpretation. We investigate the use of LLMs to help researchers scale this interpretation.
Methods: The primary methodological phases of this project were (1) collecting data representing posts to an online nurse forum, as well as cleaning and preprocessing the data; (2) using latent Dirichlet allocation (LDA) to derive topics; (3) using human categorization for topic modeling; and (4) using LLMs to complement and scale the interpretation of thematic analysis. The purpose is to compare the outcomes of human interpretation with those derived from LLMs.
Results: There is substantial agreement (247/310, 80%) between LLM and human interpretation. For two-thirds of the topics, human evaluation and LLMs agree on alignment and convergence of themes. Furthermore, LLM subthemes offer depth of analysis within LDA topics, providing detailed explanations that align with and build upon established human themes. Nonetheless, LLMs identify coherence and complementarity where human evaluation does not.
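The reported agreement figure can be checked with simple arithmetic. The sketch below assumes the 247/310 figure is a raw proportion of matching LLM and human judgments; the helper name `percent_agreement` is hypothetical, and the abstract does not say whether a chance-corrected statistic was also computed.

```python
def percent_agreement(agreeing: int, total: int) -> float:
    """Return the share of items on which two raters agree."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= agreeing <= total:
        raise ValueError("agreeing must be between 0 and total")
    return agreeing / total


# Counts as reported in the Results paragraph.
rate = percent_agreement(247, 310)
print(f"{rate:.1%}")  # 79.7%, which the abstract rounds to 80%
```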
Conclusions: LLMs enable the automation of the interpretation task in qualitative research. There are challenges in the use of LLMs for evaluation of the resulting themes.
Keywords: ChatGPT; artificial intelligence; generative AI; health care; large language models; machine learning.
© Arturo Castellanos, Haoqiang Jiang, Paulo Gomes, Debra Vander Meer, Alfred Castillo. Originally published in JMIR AI (https://ai.jmir.org).