JMIR Mental Health (Jan 2024)

Developing a Framework to Infer Opioid Use Disorder Severity From Clinical Notes to Inform Natural Language Processing Methods: Characterization Study

  • Melissa N Poulsen,
  • Philip J Freda,
  • Vanessa Troiani,
  • Danielle L Mowery

DOI
https://doi.org/10.2196/53366
Journal volume & issue
Vol. 11
p. e53366

Abstract

Read online

BackgroundInformation regarding opioid use disorder (OUD) status and severity is important for patient care. Clinical notes provide valuable information for detecting and characterizing problematic opioid use, necessitating development of natural language processing (NLP) tools, which in turn requires reliably labeled OUD-relevant text and understanding of documentation patterns. ObjectiveTo inform automated NLP methods, we aimed to develop and evaluate an annotation schema for characterizing OUD and its severity, and to document patterns of OUD-relevant information within clinical notes of heterogeneous patient cohorts. MethodsWe developed an annotation schema to characterize OUD severity based on criteria from the Diagnostic and Statistical Manual of Mental Disorders, 5th edition. In total, 2 annotators reviewed clinical notes from key encounters of 100 adult patients with varied evidence of OUD, including patients with and those without chronic pain, with and without medication treatment for OUD, and a control group. We completed annotations at the sentence level. We calculated severity scores based on annotation of note text with 18 classes aligned with criteria for OUD severity and determined positive predictive values for OUD severity. ResultsThe annotation schema contained 27 classes. We annotated 1436 sentences from 82 patients; notes of 18 patients (11 of whom were controls) contained no relevant information. Interannotator agreement was above 70% for 11 of 15 batches of reviewed notes. Severity scores for control group patients were all 0. Among noncontrol patients, the mean severity score was 5.1 (SD 3.2), indicating moderate OUD, and the positive predictive value for detecting moderate or severe OUD was 0.71. Progress notes and notes from emergency department and outpatient settings contained the most and greatest diversity of information. Substance misuse and psychiatric classes were most prevalent and highly correlated across note types with high co-occurrence across patients. ConclusionsImplementation of the annotation schema demonstrated strong potential for inferring OUD severity based on key information in a small set of clinical notes and highlighting where such information is documented. These advancements will facilitate NLP tool development to improve OUD prevention, diagnosis, and treatment.