IEEE Access (Jan 2023)

Predicting Drug Review Polarity Using the Combination Model of Multi-Sense Word Embedding and Fuzzy Latent Dirichlet Allocation (FLDA)

  • Siyue Song
  • Anju P. Johnson

DOI
https://doi.org/10.1109/ACCESS.2023.3326757
Journal volume & issue
Vol. 11
pp. 118538 – 118546

Abstract

The massive volume of textual data generated in recent years has driven the development of new computer-based technologies, especially in healthcare. Sentiment analysis opens a new door in healthcare to improve public health data analysis and efficiently predict diseases. Many words in natural language have multiple meanings or senses. However, traditional algorithms assign each word a single meaning and cannot capture its multiple senses, leading to potential inaccuracies in sentiment analysis. Additionally, dealing with vagueness in linguistic terms is a common challenge in natural language processing; in particular, relying on simple term frequencies is insufficient to measure the development states of different topics. In this research, we applied two multi-sense word embedding models, Probabilistic Fasttext and Multi-sense Skip-gram, to the sentiment analysis of drug reviews. The proposed models can better represent words with multiple meanings, producing more accurate sentiment analysis results. Additionally, we compared multi-sense word embeddings with single-sense embedding models and evaluated the classification methods against other classical machine learning techniques. Finally, a fuzzy system was applied to estimate the topics hidden in the drug review dataset using the Latent Dirichlet Allocation (LDA) model, and a fuzzy rule-based system was applied to explain the classification result of drug review polarity. Both models performed well on the classification task: Probabilistic Fasttext achieved an accuracy of 82.1%, and Multi-sense Skip-gram achieved an accuracy of 79.8%. The work addresses several critical challenges in sentiment analysis of healthcare data and proposes a comprehensive approach to tackle them. The reported results indicate promising performance, and potential future applications in other medical domains beyond drug reviews further highlight the significance of this research.
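
The idea behind multi-sense embeddings mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the `SENSES` table, and the helper names `context_vector` and `pick_sense` are all hypothetical. It only shows one common way a word with several sense vectors might be disambiguated, by choosing the sense closest (by cosine similarity) to the average of its context words.

```python
import math

# Hypothetical toy sense vectors, invented for illustration (not from the paper).
# Each word maps to a list of sense vectors; most words here have a single sense.
SENSES = {
    "patch": [
        [1.0, 0.0, 0.0],  # sense 0: transdermal drug patch
        [0.0, 1.0, 0.0],  # sense 1: patch of skin, e.g. a rash
    ],
    "nicotine": [[0.9, 0.1, 0.0]],
    "craving": [[0.8, 0.0, 0.2]],
    "itchy": [[0.1, 0.9, 0.0]],
    "red": [[0.0, 0.8, 0.2]],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(context_words):
    """Average the first sense vector of every known context word, or None if none are known."""
    vecs = [SENSES[w][0] for w in context_words if w in SENSES]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def pick_sense(word, context_words):
    """Return the index of the sense of `word` most similar to its context (0 if no context)."""
    ctx = context_vector(context_words)
    if ctx is None:
        return 0
    sims = [cosine(sense, ctx) for sense in SENSES[word]]
    return max(range(len(sims)), key=sims.__getitem__)


print(pick_sense("patch", ["nicotine", "craving"]))
print(pick_sense("patch", ["itchy", "red"]))
```

With these toy vectors, "patch" resolves to the drug-delivery sense next to "nicotine" and "craving", and to the skin sense next to "itchy" and "red". A single-vector embedding would have to blend both meanings into one point, which is the limitation the abstract describes.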

Keywords