Determining the Efficiency of Drugs Under Special Conditions From Users’ Reviews on Healthcare Web Forums

Eysha Saad; Sadia Din; Ramish Jamil; Furqan Rustam; Arif Mehmood; Imran Ashraf; Gyu Sang Choi

doi:10.1109/ACCESS.2021.3088838

IEEE Access (Jan 2021)

Affiliations

Eysha Saad: Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Sadia Din: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, South Korea
Ramish Jamil: Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Furqan Rustam: Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Arif Mehmood: Department of Computer Science and Information Technology, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
Imran Ashraf: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, South Korea
Gyu Sang Choi: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, South Korea

DOI: https://doi.org/10.1109/ACCESS.2021.3088838
Journal volume & issue: Vol. 9
pp. 85721 – 85737

Abstract

Sentiment analysis is the extraction and categorization of sentiments expressed in text data using text analysis techniques. As earlier studies have shown, sentiment analysis of drug reviews has large potential for providing valuable insights that help healthcare professionals and companies evaluate the safety of drugs after they have been marketed. Such insights help safeguard patients and increase their trust in medical companies. Existing systems follow either a lexicon-based or a learning-based approach to sentiment analysis in the medical domain. Learning-based techniques require annotated data, while lexicon-based techniques tend to be domain-specific, which restricts their wide use. This research proposes a hybrid technique that combines learning-based and lexicon-based approaches to achieve better results. General-purpose sentiment lexicons, such as AFINN, TextBlob, and VADER, are used for annotating the reviews. Furthermore, several feature engineering techniques, such as term frequency (TF), term frequency-inverse document frequency (TF-IDF), and the union of TF and TF-IDF (TF U TF-IDF), are used to extract useful features. Finally, learning models including logistic regression (LR), AdaBoost classifier (AB), random forest (RF), extra tree classifier (ETC), and multilayer perceptron (MLP) are used to classify the sentiments of the reviews. The performance of the proposed hybrid approach is evaluated using accuracy, precision, recall, and F1-score. Experimental results indicate that the combination of learning-based and lexicon-based approaches provides better results than either approach used individually. Moreover, TextBlob has shown promising results, giving an accuracy of 96% with MLP when used with TF-IDF and with LR when used with TF U TF-IDF.
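
The hybrid idea described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The tiny `TOY_LEXICON` is a hypothetical stand-in for a general-purpose lexicon such as AFINN, TextBlob, or VADER, the three example reviews are invented, and the union of TF and TF-IDF is assumed here to mean concatenating the two feature vectors. The smoothed IDF formula is likewise one common choice, not one taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (assumption): word -> polarity score.
# It only stands in for a real general-purpose sentiment lexicon.
TOY_LEXICON = {"great": 2, "helped": 1, "relief": 1,
               "terrible": -2, "nausea": -1, "worse": -1}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (word.strip(".,!?").lower() for word in text.split())
    return [tok for tok in tokens if tok]


def lexicon_label(text):
    """Annotate a review from the summed lexicon scores of its tokens."""
    score = sum(TOY_LEXICON.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def build_vocab(docs):
    """Sorted list of all distinct tokens in the corpus."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def tf_vector(doc, vocab):
    """Raw term counts for one document over the vocabulary."""
    counts = Counter(tokenize(doc))
    return [float(counts[term]) for term in vocab]


def idf_weights(docs, vocab):
    """Smoothed IDF: ln((1 + n) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    token_sets = [set(tokenize(doc)) for doc in docs]
    return [
        math.log((1 + n) / (1 + sum(term in s for s in token_sets))) + 1.0
        for term in vocab
    ]


def tfidf_vector(doc, vocab, idf):
    """Term counts scaled by the IDF weight of each term."""
    return [tf * w for tf, w in zip(tf_vector(doc, vocab), idf)]


def union_features(doc, vocab, idf):
    """TF U TF-IDF, taken here as concatenation of both vectors."""
    return tf_vector(doc, vocab) + tfidf_vector(doc, vocab, idf)


# Invented example reviews (not from the paper's dataset).
reviews = [
    "This drug helped and gave great relief",
    "Terrible nausea and I felt worse",
    "I took it for two weeks",
]
labels = [lexicon_label(r) for r in reviews]   # lexicon-based annotation
vocab = build_vocab(reviews)
idf = idf_weights(reviews, vocab)
features = [union_features(r, vocab, idf) for r in reviews]
```

In a full pipeline, `features` and `labels` would then serve as the training input for a classifier such as LR or MLP; that step is omitted here to keep the sketch dependency-free.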

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
