Observational Study
J Clin Anesth. 2025 Mar:102:111787.
doi: 10.1016/j.jclinane.2025.111787. Epub 2025 Feb 21.

Assessing the accuracy of ChatGPT in interpreting blood gas analysis results ChatGPT-4 in blood gas analysis

Engin İhsan Turan et al. J Clin Anesth. 2025 Mar.

Abstract

Background: Arterial blood gas (ABG) analysis is a critical component of patient management in intensive care units (ICUs), operating rooms, and general wards, providing essential information on acid-base balance, oxygenation, and metabolic status. Interpretation requires a high level of expertise, potentially leading to variability in accuracy. This study explores the feasibility and accuracy of ChatGPT-4, an AI-based model, in interpreting ABG results compared to experienced anesthesiologists.

Methods: This prospective observational study, approved by the institutional ethics board, included 400 ABG samples from ICU patients, anonymized and assessed by ChatGPT-4. The model analyzed parameters including acid-base status, oxygenation, hemoglobin levels, and metabolic markers, and provided both diagnostic and treatment recommendations. Two anesthesiologists, trained in ABG interpretation, independently evaluated the model's predictions to determine accuracy in potential diagnoses and treatment.

Results: ChatGPT-4 achieved high accuracy across most ABG parameters, with 100 % accuracy for pH, oxygenation, sodium, and chloride. Hemoglobin accuracy was 92.5 %, while bilirubin interpretation showed limitations at 72.5 %. In several cases, the model recommended unnecessary bicarbonate treatment, suggesting an area for improvement in clinical judgment for acid-base balance management. The model's overall performance was statistically significant across most parameters (p < 0.05).
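The abstract reports accuracy as a percentage per parameter but does not spell out the underlying counts. As a rough sketch of the arithmetic, assuming accuracy means the share of interpretations the raters judged correct and that every parameter was scored on all 400 samples (neither is confirmed by the abstract), the figures could be reproduced as follows. The function name and the verdict lists are hypothetical and are not the study's data.

```python
def accuracy_percent(judgments):
    """Return the share of interpretations rated correct, as a percentage."""
    if not judgments:
        raise ValueError("no judgments to score")
    # True counts as 1 and False as 0, so sum() gives the number rated correct.
    return 100.0 * sum(judgments) / len(judgments)


# Hypothetical rater verdicts (True = rated correct). These lists are
# illustrative only; they assume 400 scored samples per parameter.
example_verdicts = {
    "pH": [True] * 400,
    "hemoglobin": [True] * 370 + [False] * 30,
    "bilirubin": [True] * 290 + [False] * 110,
}

for parameter, verdicts in example_verdicts.items():
    print(f"{parameter}: {accuracy_percent(verdicts):.1f} %")
```

Under these assumptions, 92.5 % would correspond to 370 of 400 interpretations rated correct and 72.5 % to 290 of 400.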

Discussion: ChatGPT-4 demonstrated potential as a supplementary tool for ABG interpretation in high-demand clinical settings, supporting rapid, reliable decision-making. However, the model's limitations in interpreting complex metabolic markers highlight the need for clinician oversight. Future refinements should focus on enhancing AI training for nuanced metabolic interpretation, particularly for markers like bilirubin, to ensure safe and effective application across diverse clinical contexts.

Keywords: Acid-base imbalance; Arterial blood gas analysis; Artificial intelligence; Clinical decision support; Oxygenation.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have not received any financial support for the conduct of this research or the preparation of this manuscript. Furthermore, there are no conflicts of interest to disclose by any of the authors.