IEEE Access (Jan 2023)
Forecasting National-Level Self-Harm Trends With Social Networks
Abstract
Self-harm refers to acts of self-inflicted poisoning or injury that result in either non-fatal injury or death, irrespective of the individual’s intention. Self-harm incidents not only cause loss to individuals but also harm the nation’s economy. Studies have shown rising self-harm trends that correlate with technological advancement and rapid urbanization in developing countries. The capacity to nowcast and forecast national-level self-harm trends could be valuable to policymakers and public health stakeholders, as it would enable them to act promptly to counteract the underlying factors or avert the projected incidents. Prior research has used historical data to predict population-level self-harm trends in various nations with conventional statistical forecasting methods. However, in some countries such historical statistics may be difficult to obtain or insufficient for accurate prediction, which impedes timely understanding and projection of the national self-harm landscape. This paper proposes FAST, a framework that forecasts national-level self-harm trends by analyzing mental signals extracted from a large volume of social media data. These signals serve as a proxy for real-world population mental health and could be used to improve the forecastability of self-harm trends. Specifically, language-agnostic language models are first trained to extract different mental signals from collected social media messages. These signals are then aggregated and processed into multivariate time series, to which a time-delay embedding algorithm is applied to produce temporally embedded instances. Finally, various machine learning regressors are evaluated on these instances for their forecasting performance.
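The time-delay embedding step can be illustrated with a minimal sketch. It is not the FAST implementation: the function name `time_delay_embed`, the `window` and `horizon` parameters, and the toy values are all assumptions made for illustration. Each instance concatenates the last `window` observations of every signal and pairs them with the target value `horizon` steps ahead, so `horizon=0` corresponds to nowcasting.

```python
from typing import List, Sequence, Tuple


def time_delay_embed(
    signals: Sequence[Sequence[float]],
    target: Sequence[float],
    window: int,
    horizon: int,
) -> Tuple[List[List[float]], List[float]]:
    """Turn a multivariate time series into (features, label) instances.

    signals[t] holds the values of all signals at time step t.
    target[t] is the quantity to predict at time step t.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(signals) != len(target):
        raise ValueError("signals and target must have the same length")
    if window < 1 or horizon < 0:
        raise ValueError("window must be >= 1 and horizon must be >= 0")

    features: List[List[float]] = []
    labels: List[float] = []
    last_start = len(signals) - window - horizon
    for start in range(last_start + 1):
        end = start + window  # exclusive end of the lag window
        # Flatten the lag window: all signals at step start, then start+1, ...
        row = [value for step in signals[start:end] for value in step]
        features.append(row)
        # Label is the target `horizon` steps after the last observed step.
        labels.append(target[end - 1 + horizon])
    return features, labels


if __name__ == "__main__":
    # Toy data: 5 time steps, 2 signals per step (values are made up).
    toy_signals = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
    toy_target = [100.0, 200.0, 300.0, 400.0, 500.0]
    X, y = time_delay_embed(toy_signals, toy_target, window=2, horizon=1)
    for row, label in zip(X, y):
        print(row, "->", label)
```

With five time steps, a window of two, and a horizon of one, the sketch yields three instances of four features each. The resulting rows can be passed to any tabular regressor, which is what allows standard machine learning models to be applied to a time series.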
The proposed method is validated through a case study in Thailand, in which a set of 12 mental signals extracted from tweets is used to forecast death and injury cases resulting from self-harm. The results show that the proposed method outperforms the traditional ARIMA baseline by 43.56% and 36.48% on average in terms of mean absolute percentage error (MAPE) when forecasting death and injury cases from self-harm, respectively. To the best of our knowledge, this research is the first to explore aggregated social media information for nowcasting and forecasting self-harm trends on a nationwide scale. The results not only provide insight into improved forecasting techniques for self-harm trends but also lay a foundation for future social-network-driven applications that depend on the ability to predict socioeconomic factors.
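For readers unfamiliar with the metric, the following sketch shows how MAPE is computed and how a gain over a baseline could be expressed. The abstract does not state whether the reported gains are relative reductions or percentage-point differences; the `relative_improvement` helper below assumes the relative reading, and both function names and all numbers are hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_improvement(baseline_mape: float, model_mape: float) -> float:
    """Percentage reduction in MAPE relative to the baseline.

    Assumption: gains are read as relative reductions; the source
    does not specify this.
    """
    if baseline_mape <= 0:
        raise ValueError("baseline MAPE must be positive")
    return 100.0 * (baseline_mape - model_mape) / baseline_mape


if __name__ == "__main__":
    # Made-up monthly case counts and forecasts, for illustration only.
    observed = [100.0, 200.0]
    baseline_forecast = [120.0, 160.0]
    model_forecast = [110.0, 180.0]
    base = mape(observed, baseline_forecast)
    ours = mape(observed, model_forecast)
    print(f"baseline MAPE: {base:.2f}%  model MAPE: {ours:.2f}%")
    print(f"relative improvement: {relative_improvement(base, ours):.2f}%")
```

Because MAPE divides by the actual value, it is undefined for periods with zero recorded cases, which is why the sketch rejects such inputs rather than silently skipping them.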
Keywords