Dermatol Reports. 2025 May 23;17(2):10138.
doi: 10.4081/dr.2024.10138. Epub 2024 Nov 28.

Application of ChatGPT as a content generation tool in continuing medical education: acne as a test topic

Luigi Naldi et al. Dermatol Reports.

Abstract

The large language model (LLM) ChatGPT can answer open-ended and complex questions, but its accuracy in providing reliable medical information requires careful assessment. As part of the AI-CHECK (Artificial Intelligence for CME Health E-learning Contents and Knowledge) study, which aims to evaluate the potential of ChatGPT in continuing medical education (CME), we compared ChatGPT-generated educational content with the recommendations of the National Institute for Health and Care Excellence (NICE) guidelines on acne vulgaris. ChatGPT version 4 was given a 23-item questionnaire developed by an experienced dermatologist. A panel of five dermatologists rated the answers positively in terms of "quality" (87.8%), "readability" (94.8%), "accuracy" (75.7%), "thoroughness" (85.2%), and "consistency" with guidelines (76.8%). The references provided by ChatGPT obtained positive ratings for "pertinence" (94.6%), "relevance" (91.2%), and "update" (62.3%). Internal reproducibility was adequate for both answers (93.5%) and references (67.4%). Answers on issues that are uncertain or controversial in the scientific community scored the lowest. This study underscores the need for rigorous evaluation criteria for AI-generated medical content and for expert oversight to ensure accuracy and guideline adherence.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Stacked bar chart of overall evaluators’ judgments of questionnaire answers for each domain.

Figure 2. Radar chart of median evaluators’ judgments of questionnaire answers for each domain. Questions not assessable due to the lack of or limited discussion in guidelines were removed from the domain “consistency”.

Figure 3. Radar chart of overall positive evaluators’ judgments of questionnaire references provided for each answer and for each domain investigated.
