2024 Feb 29;11(1):260.
doi: 10.1038/s41597-024-03036-2.

A large dataset of annotated incident reports on medication errors

Zoie S Y Wong et al. Sci Data.

Abstract

Incident reports of medication errors are valuable learning resources for improving patient safety. However, pertinent information is often contained within unstructured free text, which prevents automated analysis and limits the usefulness of these data. Natural language processing can structure this free text automatically and retrieve relevant past incidents and learning materials, but doing so requires a large, fully annotated and validated corpus of incident reports. We present a corpus of 58,658 machine-annotated incident reports of medication errors that can be used to advance the development of information extraction models and subsequent incident learning. In the cross-validation exercise, the best F1-scores for the annotated dataset were 0.97 for named entity recognition and 0.76 for intention/factuality analysis. Our dataset contains 478,175 named entities and differentiates between incident types by recognising discrepancies between what was intended and what actually occurred. We explain our annotation workflow and technical validation and provide access to the validation datasets and machine annotator for labelling future incident reports of medication errors.
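
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1-score can be computed for named entity recognition by comparing predicted entity spans against gold annotations. It is an illustration only: the abstract does not state whether the authors scored at span or token level or how they averaged across entity types, and the function name `entity_f1`, the span format and the entity labels used here are hypothetical.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over entity spans.

    Each entity is a (start, end, label) tuple. This span format and the
    exact-match criterion are assumptions made for illustration, not the
    scoring procedure described by the dataset authors.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans and labels; the real label set is not given in the abstract.
gold_spans = {(0, 9, "drug"), (10, 14, "dosage"), (20, 28, "route")}
predicted_spans = {(0, 9, "drug"), (10, 14, "dosage"), (30, 35, "timing")}

score = entity_f1(gold_spans, predicted_spans)
print(round(score, 3))
```

Here two of the three predicted spans match the gold annotations, so precision and recall are both 2/3 and the F1-score is also 2/3.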

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Data creation workflow. (a) Building the basic dataset, (b) building the machine annotator, which labels named entities, intention/factuality and incident types, and (c) technical validation via internal validation, external validation and error analysis.
Fig. 2. Distribution of incident reports of medication errors (58,658). Y-axis: the number of reports. X-axis: the number of words.
Fig. 3. Example of annotation of free text from an incident report. The legend indicates how to distinguish between types of named entity, the results of intention/factuality analysis and relation status. The type of incident inferred by the model is shown on the right.
Fig. 4. Example of annotation of free text from an incident report. The legend indicates how to distinguish between types of named entity, the results of intention/factuality analysis and relation status. The type of incident inferred by the model is shown on the right.
Fig. 5. The development of the multi-task BERT machine annotator. (a) Pre-training phase, (b) fine-tuning phase 1 and (c) fine-tuning phase 2.
Fig. 6. An overview of an annotated incident report.
Fig. 7. Technical validation summary. Reported F1-scores for the cross-validation, internal validation and external validation exercises.
