PLoS One. 2024 Dec 12;19(12):e0312078.
doi: 10.1371/journal.pone.0312078. eCollection 2024.

Evaluating generative artificial intelligence's limitations in health policy identification and interpretation

Rory Wilson et al. PLoS One.

Abstract

Policy epidemiology uses human subject-matter experts (SMEs) to systematically surface, analyze, and categorize legally enforceable policies. The Analysis and Mapping of Policies for Emerging Infectious Diseases project systematically collects and assesses health-related policies from all United Nations Member States. The recent proliferation of generative artificial intelligence (GAI) tools powered by large language models has led to suggestions that such technologies be incorporated into our project and similar research efforts to decrease the human resources required. To test the accuracy and precision of GAI in identifying and interpreting health policies, we designed a study to systematically compare the responses produced by a GAI tool with those produced by an SME. We used two validated policy datasets, one on emergency and childhood vaccination policy and one on quarantine and isolation policy, in each United Nations Member State. We found that the SME and GAI tool were concordant 78.09% of the time for the vaccination dataset and 67.01% of the time for the quarantine and isolation dataset. The GAI tool also significantly hastened the data collection process. However, our analysis of non-concordant results revealed systematic inaccuracies and imprecision across different World Health Organization regions. Regarding vaccination, over 50% of countries in the African, Southeast Asian, and Eastern Mediterranean regions were inaccurately represented in GAI responses. This trend was similar for quarantine and isolation, with the African and Eastern Mediterranean regions least concordant. Furthermore, GAI responses provided laws or information missed by the SME only 2.14% and 2.48% of the time for the vaccination dataset and the quarantine and isolation dataset, respectively. Notably, the GAI was least concordant with the SME when tasked with policy interpretation. These results suggest that GAI tools require further development to accurately identify policies across diverse global regions and interpret context-specific information. Nevertheless, we found that GAI is a useful tool for quality assurance and quality control processes in health policy identification.
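The regional concordance rates reported above can be illustrated with a minimal sketch, assuming concordance is simply the share of records in which the SME code and the GAI code match. The function name, the record layout, and the toy data are hypothetical illustrations and are not taken from the study.

```python
from collections import defaultdict


def concordance_by_region(records):
    """Return {region: percent of records where the SME and GAI codes match}.

    Each record is a (region, sme_code, gai_code) tuple. This layout is an
    assumption made for illustration, not the study's actual data format.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for region, sme_code, gai_code in records:
        totals[region] += 1
        if sme_code == gai_code:
            matches[region] += 1
    # Percent concordance per region, rounded to two decimals.
    return {r: round(100 * matches[r] / totals[r], 2) for r in totals}


# Hypothetical toy records; these are NOT data from the study.
toy_records = [
    ("Region A", "law found", "law found"),
    ("Region A", "law found", "no law found"),
    ("Region B", "no law found", "no law found"),
    ("Region B", "law found", "law found"),
]

print(concordance_by_region(toy_records))
# prints {'Region A': 50.0, 'Region B': 100.0}
```

Grouping by region before dividing keeps a low-concordance region from being hidden inside a single global percentage, which is the pattern the abstract describes.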


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Decision and coding tree for vaccination methodology.
This decision tree, read top to bottom, was used across all UN Member States. For each country, query terms were used, and, after exhausting all query terms, the aggregate responses were used to make decisions according to this standardized tree. Every possible response results in a coding directive; the directives are color coded at the base of the tree. Non-concordant results were validated by an independent researcher to determine whether the SME or GAI was correct.
Fig 2. Decision and coding tree for quarantine and isolation law identification and interpretation.
This decision tree, read top to bottom, was used across all UN Member States. For each country, query terms were used, and, after exhausting all query terms, the aggregate responses were used to make decisions according to this standardized tree. All possible responses result in a coding directive.
Fig 3. Unfiltered vaccination concordance rates per WHO region.
Fig 4. Filtered vaccination concordance rates per WHO region.
Fig 5. Maps of the concordance between SME research team and GAI tool on routine and emergency vaccination policies in each UN Member State.
Panel A includes data on routine childhood vaccination policies, while panel B includes data on emergency powers for vaccination.
Fig 6. Quarantine and isolation concordance rates per WHO region.
Fig 7. Maps of the concordance between SME research team and GAI tool on quarantine and isolation policies in each UN Member State.
Panel A includes data on isolation policies, which were surfaced through the first query in the series, while panel B includes data on quarantine policies, which were surfaced by the sixth query in the series.
