JMIR AI. 2025 Apr 4;4:e64447.
doi: 10.2196/64447.

Large Language Models for Thematic Summarization in Qualitative Health Care Research: Comparative Analysis of Model and Human Performance

Arturo Castellanos et al. JMIR AI.

Abstract

Background: The application of large language models (LLMs) in analyzing expert textual online data is a topic of growing importance in computational linguistics and qualitative research within health care settings.

Objective: This study aimed to understand how LLMs can help analyze expert textual data. Topic modeling scales the thematic analysis of a large corpus, but its output still requires interpretation. We investigate the use of LLMs to help researchers scale this interpretation.

Methods: The primary methodological phases of this project were (1) collecting data representing posts to an online nurse forum, as well as cleaning and preprocessing the data; (2) using latent Dirichlet allocation (LDA) to derive topics; (3) using human categorization for topic modeling; and (4) using LLMs to complement and scale the interpretation of thematic analysis. The purpose is to compare the outcomes of human interpretation with those derived from LLMs.
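The hand-off between phases (1), (2), and (4) can be pictured with a minimal Python sketch. It assumes each LDA topic is represented by its top-weighted words and that an LLM is asked to name a theme from them. The stop-word list, the `clean_post` and `build_theme_prompt` helpers, and the prompt wording are hypothetical illustrations, not the authors' pipeline, and no model is called.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real study would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in",
              "is", "it", "i", "my", "for", "on"}


def clean_post(text):
    """Lowercase a forum post, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_theme_prompt(topic_id, top_words):
    """Format one topic's top words as a prompt asking an LLM for a theme label."""
    words = ", ".join(top_words)
    return (
        f"Topic {topic_id} from a nurse forum has these top words: {words}. "
        "Suggest a short theme label and one sentence of justification."
    )


# Illustrative inputs only; these are not data from the study.
tokens = clean_post("I worry about the staffing ratio on my unit.")
prompt = build_theme_prompt(3, ["staffing", "ratio", "shift", "unit"])
```

The cleaned tokens would feed a topic model such as LDA, and the prompt text would be sent to whichever LLM is under evaluation; both of those steps are left out here because the abstract does not specify them.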

Results: There is substantial agreement (247/310, 80%) between LLM and human interpretation. For two-thirds of the topics, human evaluation and LLMs agree on alignment and convergence of themes. Furthermore, LLM subthemes offer depth of analysis within LDA topics, providing detailed explanations that align with and build upon established human themes. Nonetheless, LLMs identify coherence and complementarity where human evaluation does not.
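As a quick check on the reported figure, 247 of 310 works out to about 79.7%, which rounds to the stated 80%. The sketch below assumes agreement is a simple share of paired judgments that match; the abstract does not say what the 310 units are, and `percent_agreement` is a hypothetical helper rather than the authors' measure.

```python
def percent_agreement(human_labels, llm_labels):
    """Share of paired judgments on which the human and LLM labels match."""
    if len(human_labels) != len(llm_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(h == m for h, m in zip(human_labels, llm_labels))
    return matches / len(human_labels)


# Counts reported in the abstract: 247 agreements out of 310 comparisons.
reported_share = 247 / 310
reported_percent = round(reported_share * 100, 1)

# Toy labels (not study data) showing the helper on four paired judgments.
toy_share = percent_agreement(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```

Raw percent agreement does not correct for matches expected by chance, so a chance-adjusted statistic may give a lower figure on the same data.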

Conclusions: LLMs enable the automation of the interpretation task in qualitative research. There are challenges in the use of LLMs for evaluation of the resulting themes.

Keywords: ChatGPT; artificial intelligence; generative AI; health care; large language models; machine learning.

Conflict of interest statement

Conflicts of Interest: None declared.
