Journal of Medical Internet Research (Nov 2020)

Detection of Suicidality Among Opioid Users on Reddit: Machine Learning–Based Approach

  • Yao, Hannah,
  • Rashidian, Sina,
  • Dong, Xinyu,
  • Duanmu, Hongyi,
  • Rosenthal, Richard N,
  • Wang, Fusheng

DOI
https://doi.org/10.2196/15293
Journal volume & issue
Vol. 22, no. 11
p. e15293

Abstract

Read online

BackgroundIn recent years, both suicide and overdose rates have been increasing. Many individuals who struggle with opioid use disorder are prone to suicidal ideation; this may often result in overdose. However, these fatal overdoses are difficult to classify as intentional or unintentional. Intentional overdose is difficult to detect, partially due to the lack of predictors and social stigmas that push individuals away from seeking help. These individuals may instead use web-based means to articulate their concerns. ObjectiveThis study aimed to extract posts of suicidality among opioid users on Reddit using machine learning methods. The performance of the models is derivative of the data purity, and the results will help us to better understand the rationale of these users, providing new insights into individuals who are part of the opioid epidemic. MethodsReddit posts between June 2017 and June 2018 were collected from r/suicidewatch, r/depression, a set of opioid-related subreddits, and a control subreddit set. We first classified suicidal versus nonsuicidal languages and then classified users with opioid usage versus those without opioid usage. Several traditional baselines and neural network (NN) text classifiers were trained using subreddit names as the labels and combinations of semantic inputs. We then attempted to extract out-of-sample data belonging to the intersection of suicide ideation and opioid abuse. Amazon Mechanical Turk was used to provide labels for the out-of-sample data. ResultsClassification results were at least 90% across all models for at least one combination of input; the best classifier was convolutional neural network, which obtained an F1 score of 96.6%. When predicting out-of-sample data for posts containing both suicidal ideation and signs of opioid addiction, NN classifiers produced more false positives and traditional methods produced more false negatives, which is less desirable for predicting suicidal sentiments. ConclusionsOpioid abuse is linked to the risk of unintentional overdose and suicide risk. Social media platforms such as Reddit contain metadata that can aid machine learning and provide information at a personal level that cannot be obtained elsewhere. We demonstrate that it is possible to use NNs as a tool to predict an out-of-sample target with a model built from data sets labeled by characteristics we wish to distinguish in the out-of-sample target.