PLoS One. 2024 Dec 12;19(12):e0312078.

doi: 10.1371/journal.pone.0312078. eCollection 2024.

Evaluating generative artificial intelligence's limitations in health policy identification and interpretation

Rory Wilson¹, Ciara M Weets¹, Amanda Rosner¹, Rebecca Katz¹

PMID: 39666618
PMCID: PMC11637257
DOI: 10.1371/journal.pone.0312078

Rory Wilson et al. PLoS One. 2024.

Affiliation

¹ Georgetown University Center for Global Health Science and Security, Washington, DC, United States of America.

Abstract

Policy epidemiology utilizes human subject-matter experts (SMEs) to systematically surface, analyze, and categorize legally enforceable policies. The Analysis and Mapping of Policies for Emerging Infectious Diseases project systematically collects and assesses health-related policies from all United Nations Member States. The recent proliferation of generative artificial intelligence (GAI) tools powered by large language models has led to suggestions that such technologies be incorporated into our project and similar research efforts to decrease the human resources required. To test the accuracy and precision of GAI in identifying and interpreting health policies, we designed a study to systematically assess the responses produced by a GAI tool versus those produced by an SME. We used two validated policy datasets, one on emergency and childhood vaccination policy and one on quarantine and isolation policy, in each United Nations Member State. We found that the SME and GAI tool were concordant 78.09% and 67.01% of the time for the vaccination dataset and the quarantine and isolation dataset, respectively. The GAI tool also significantly hastened the data collection process. However, our analysis of non-concordant results revealed systematic inaccuracies and imprecision across different World Health Organization regions. Regarding vaccination, over 50% of countries in the African, Southeast Asian, and Eastern Mediterranean regions were inaccurately represented in GAI responses. This trend was similar for quarantine and isolation, with the African and Eastern Mediterranean regions least concordant. Furthermore, GAI responses provided laws or information missed by the SME only 2.14% and 2.48% of the time for the vaccination dataset and for the quarantine and isolation dataset, respectively. Notably, the GAI was least concordant with the SME when tasked with policy interpretation. These results suggest that GAI tools require further development to accurately identify policies across diverse global regions and interpret context-specific information. Nonetheless, we found that GAI is a useful tool for quality assurance and quality control processes in health policy identification.
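
The concordance rates reported above are percentages of agreement between SME and GAI codings, broken down by WHO region. A minimal Python sketch of that arithmetic follows. The function name `concordance_by_region`, the record layout, and the toy data are all assumptions for illustration; they are not the authors' code or data.

```python
from collections import defaultdict

def concordance_by_region(records):
    """Return {region: percent of records where SME and GAI codes match}.

    Each record is a (region, sme_code, gai_code) tuple; this layout is an
    assumption made for the sketch, not the study's actual data format.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for region, sme_code, gai_code in records:
        totals[region] += 1
        if sme_code == gai_code:
            matches[region] += 1
    return {r: round(100.0 * matches[r] / totals[r], 2) for r in totals}

# Hypothetical toy data, not taken from the study.
records = [
    ("AFR", "yes", "no"),
    ("AFR", "yes", "yes"),
    ("EUR", "no", "no"),
    ("EUR", "yes", "yes"),
    ("EUR", "no", "no"),
    ("EMR", "yes", "no"),
]
rates = concordance_by_region(records)
print(rates)
```

Simple percent agreement of this kind does not correct for chance agreement, so it should be read as a descriptive measure only.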

Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Decision and coding tree for vaccination methodology.**
This decision tree, read top to bottom, was used across all UN Member States. For each country, query terms were used, and, after exhausting all query terms, the aggregate responses were used to make decisions according to this standardized tree. Every possible response results in a coding directive; the directives are color coded at the base of the tree. Non-concordant results were validated by an independent researcher to determine whether the SME or GAI was correct.

**Fig 2. Decision and coding tree for quarantine and isolation law identification and interpretation.**
This decision tree, read top to bottom, was used across all UN Member States. For each country, query terms were used, and, after exhausting all query terms, the aggregate responses were used to make decisions according to this standardized tree. All possible responses result in a coding directive.

**Fig 3. Unfiltered vaccination concordance rates per WHO region.**

**Fig 4. Filtered vaccination concordance rates per WHO region.**

**Fig 5. Maps of the concordance between SME research team and GAI tool on routine and emergency vaccination policies in each UN Member State.**
Panel A includes data on routine childhood vaccination policies, while panel B includes data on emergency powers for vaccination.

**Fig 6. Quarantine and isolation concordance rates per WHO region.**

**Fig 7. Maps of the concordance between SME research team and GAI tool on quarantine and isolation policies in each UN Member State.**
Panel A includes data on isolation policies which were surfaced through the first query in the series, while panel B includes data on quarantine policies surfaced by the sixth query of the series.
