A large dataset of annotated incident reports on medication errors

Zoie S. Y. Wong; Neil Waters; Jiaxing Liu; Shin Ushiro

doi:10.1038/s41597-024-03036-2

Scientific Data (Feb 2024)

Affiliations

Zoie S. Y. Wong: Graduate School of Public Health, St. Luke’s International University
Neil Waters: Graduate School of Public Health, St. Luke’s International University
Jiaxing Liu: School of Statistics and Mathematics, Zhongnan University of Economics and Law
Shin Ushiro: Division of Patient Safety, Kyushu University Hospital

DOI: https://doi.org/10.1038/s41597-024-03036-2
Journal volume & issue: Vol. 11, no. 1, pp. 1–8

Abstract

Incident reports of medication errors are valuable learning resources for improving patient safety. However, pertinent information is often contained within unstructured free text, which prevents automated analysis and limits the usefulness of these data. Natural language processing can structure this free text automatically and retrieve relevant past incidents and learning materials, but doing so requires a large, fully annotated and validated corpus of incident reports. We present a corpus of 58,658 machine-annotated incident reports of medication errors that can be used to advance the development of information extraction models and subsequent incident learning. In the cross-validation exercise, the best F1-scores for the annotated dataset were 0.97 for named entity recognition and 0.76 for intention/factuality analysis. Our dataset contains 478,175 named entities and differentiates between incident types by recognising discrepancies between what was intended and what actually occurred. We explain our annotation workflow and technical validation and provide access to the validation datasets and the machine annotator for labelling future incident reports of medication errors.

Published in Scientific Data

ISSN: 2052-4463 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/sdata/