Detecting sarcasm in multi-domain datasets using convolutional neural networks and long short term memory network model

Ramish Jamil; Imran Ashraf; Furqan Rustam; Eysha Saad; Arif Mehmood; Gyu Sang Choi

doi:10.7717/peerj-cs.645

PeerJ Computer Science (Aug 2021)

Affiliations

Ramish Jamil: Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Imran Ashraf: Information and Communication Engineering, Yeungnam University, Gyeongsan si, Daegu, South Korea
Furqan Rustam: Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Eysha Saad: Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Arif Mehmood: The Islamia University of Bahawalpur, Bahawalpur, Pakistan
Gyu Sang Choi: Information and Communication Engineering, Yeungnam University, Gyeongsan si, Daegu, South Korea

DOI: https://doi.org/10.7717/peerj-cs.645
Journal volume & issue: Vol. 7, p. e645

Abstract

Sarcasm is common on social networking sites, where people express negative thoughts, hatred and opinions using positive vocabulary, which makes sarcasm challenging to detect. Although various studies have investigated sarcasm detection on baseline datasets, this work is the first to detect sarcasm in a multi-domain dataset constructed by combining Twitter and News Headlines datasets. This study proposes a hybrid approach in which a convolutional neural network (CNN) is used for feature extraction while a long short-term memory (LSTM) network is trained and tested on those features. For performance comparison, several machine learning algorithms such as random forest, support vector classifier, extra tree classifier and decision tree are used. The performance of both the proposed model and the machine learning algorithms is analyzed using term frequency-inverse document frequency (TF-IDF), bag of words (BoW), and global vectors for word representation (GloVe). Experimental results indicate that the proposed model surpasses the traditional machine learning algorithms with an accuracy of 91.60%. Several state-of-the-art approaches for sarcasm detection are compared with the proposed model, and the results suggest that the proposed model outperforms these approaches in terms of precision, recall and F1 score. The proposed model is accurate, robust, and performs sarcasm detection on a multi-domain dataset.
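Two of the feature representations named in the abstract, bag of words and term frequency-inverse document frequency, can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the abstract does not say which TF-IDF variant, tokenizer, or preprocessing was used, so the whitespace tokenizer, the unsmoothed weighting tf × log(N / df), the function names, and the three example texts below are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; a real pipeline
    # would normally also strip punctuation, URLs, and so on.
    return text.lower().split()


def bag_of_words(docs):
    # Build a sorted vocabulary, then one raw term-count vector per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    counts = [Counter(tokenize(doc)) for doc in docs]
    vectors = [[c[term] for term in vocab] for c in counts]
    return vocab, vectors


def tf_idf(docs):
    # Assumption: plain tf * log(N / df) weighting without smoothing.
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in bow if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in bow:
        total = sum(vec)
        weighted.append(
            [(count / total) * idf[j] if total else 0.0
             for j, count in enumerate(vec)]
        )
    return vocab, weighted


# Hypothetical mixed-domain texts (tweet-like and headline-like).
docs = [
    "great another monday morning meeting",
    "local council approves new budget",
    "great news council cancels monday meeting",
]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this sketch a term that appears in only one text, such as "another", receives a higher weight than a term shared by two texts, such as "great", which is the property that separates TF-IDF from raw bag-of-words counts.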

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
