MATEC Web of Conferences (Jan 2018)

An exploration of text mining of narrative reports of injury incidents to assess risk

  • Passmore David,
  • Chae Chungil,
  • Kustikova Yulia,
  • Baker Rose,
  • Yim Jeong-Ha

DOI
https://doi.org/10.1051/matecconf/201825106020
Journal volume & issue
Vol. 251
p. 06020

Abstract

A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling identified six topics in the free-text data. One topic, which primarily describes incidents resulting in strains and sprains of the musculoskeletal system, varied in emphasis by the location on the mine property where the injury occurred, the degree of injury, and the year of occurrence. Narratives clustered around this topic most often describe injuries that occurred at surface or other locations rather than underground and that resulted in disability; emphasis on this topic also showed a long-term (secular) increase over time. The success of this exploratory modeling suggests that further topic mining of these injury narratives is justified, especially with a broad set of covariates to explain variation in topic emphasis and to compare surface mining injuries with injuries that occur during site preparation for construction.
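
As a rough illustration of the kind of modeling the abstract describes, the sketch below fits Latent Dirichlet Allocation to a handful of tokenized narratives using collapsed Gibbs sampling. It is not the authors' implementation: the abstract does not say which software or settings were used, so the function name `fit_lda`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy narratives are all assumptions made for illustration. The toy corpus is invented and uses two topics rather than the six reported.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch only).

    docs: list of token lists. Returns (theta, top_words), where theta[d][k]
    is the estimated share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic shares.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Five most frequent words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -topic_word[k][v])[:5]]
        for k in range(n_topics)
    ]
    return theta, top_words


# Invented toy narratives (NOT taken from the injury data in the study).
narratives = [
    "employee strained back lifting cable".split(),
    "employee sprained ankle stepping off truck".split(),
    "rock fell from roof struck employee hand".split(),
    "roof rock struck miner shoulder".split(),
]
theta, top_words = fit_lda(narratives, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {' '.join(words)}")
```

Per-document topic shares such as `theta` are what a study of this kind would relate to covariates like mine location, degree of injury, and year. On a corpus this small the topics themselves are not meaningful.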