Scientific Reports (Nov 2024)
Topic modeling and content analysis of people’s anxiety-related concerns raised on a computer-mediated health platform
Abstract
Background: About one in four Chinese people might suffer or have already suffered from anxiety conditions, with a lifetime prevalence rate of 4.8%. However, many of those who could benefit from psychological or pharmacological treatment go untreated because their condition is not recognized in time or diagnosed accurately.

Objective: This study used a topic modeling approach to explore people’s anxiety-related concerns raised on a computer-mediated Chinese health platform, YOU WEN BI DA (questioning and answering), in order to provide implications for accurate diagnosis, targeted education, tailored intervention, and informed policy-making in addressing this condition of public concern.

Methods: First, we extracted data from YOU WEN BI DA between May 2022 and February 2023. After cleaning the extracted data both manually and with the Python text processing tool spaCy, we determined the optimal number of topics from coherence scores and used latent Dirichlet allocation (LDA) topic modeling to generate the most salient topics and related terms. We then categorized the topics into classes of themes by plotting them onto a 2D plane via multidimensional scaling using the pyLDAvis visualization tool. Finally, we analyzed these topics and themes qualitatively to better understand people’s anxiety-related concerns.

Results: Data analysis yielded five topics with different overall prevalence. Topic 2 (tinnitus phobia-incurred concerns, n = 639) is the most prevalent dominant topic, occurring in 25.1% of the 2545 collected concerns. It is closely followed by Topic 3 (sleep, dyskinesia, bipolar, cognitive, and somatic disorders-incurred concerns, n = 619) and Topic 1 (neurosis-incurred concerns, n = 512), which appeared in 24.3% and 20.1% of the 2545 concerns, respectively. Topic 5 (social phobia-incurred concerns, n = 428) ranks fourth, showing up in 16.8% of the concerns, and Topic 4 (autonomic nerve dysfunction-incurred concerns, n = 347) accounts for the remaining 13.6%. The t-distributed stochastic neighbor embedding (t-SNE) analysis reveals partial similarities between Topics 2 and 5 and between Topics 4 and 5. Many concerns in Topics 2 and 5 pertain to people’s psychological state of fear and anxiety and to relieving such symptoms through medication, while many concerns in Topics 4 and 5 relate to people’s worries about negative effects on their nerves and to adjusting and conditioning such effects through medication.

Conclusion: This was the first study to investigate Chinese people’s anxiety-related concerns raised on YOU WEN BI DA using topic modeling. Automatic text analysis, complemented by manual interpretation of the collected data, allowed us to discover the dominant topics hidden in the data and to categorize them into themes that reveal the overall status of people’s anxiety-related concerns. The findings offer practical implications for health and medical educators, practitioners, and policy-makers in making joint efforts to address this common public concern effectively and efficiently.
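The prevalence figures reported in the Results can be rechecked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: it takes the five dominant-topic counts stated in the abstract, confirms that they sum to the 2545 collected concerns, and ranks the topics by their share. The names `counts` and `prevalence` are hypothetical.

```python
# Dominant-topic counts as stated in the abstract (topic number -> n).
counts = {1: 512, 2: 639, 3: 619, 4: 347, 5: 428}

# Total number of concerns; the abstract reports 2545.
total = sum(counts.values())


def prevalence(topic_counts):
    """Return (topic, n, percent) tuples, ranked from most to least prevalent."""
    n_total = sum(topic_counts.values())
    ranked = sorted(topic_counts.items(), key=lambda item: item[1], reverse=True)
    return [(topic, n, round(100 * n / n_total, 1)) for topic, n in ranked]


for topic, n, pct in prevalence(counts):
    print(f"Topic {topic}: n = {n}, {pct}% of {total} concerns")
```

Ranked this way, the order is Topic 2, Topic 3, Topic 1, Topic 5, Topic 4, which matches the percentages given in the abstract.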
Keywords