Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection

Carmel Mary Belinda M J; Ravikumar S; Muhammad Arif; Dhilip Kumar V; Antony Kumar K; Arulkumaran G

doi:10.1155/2022/3225920

Journal of Mathematics (Jan 2022)

Affiliations

Carmel Mary Belinda M J: Department of Computer Science & Engineering
Ravikumar S: Department of Computer Science & Engineering
Muhammad Arif: Department of Computer Science and Information Technology
Dhilip Kumar V: Department of Computer Science & Engineering
Antony Kumar K: Department of Computer Science & Engineering
Arulkumaran G: Department of Electrical and Computer Engineering

Journal volume & issue: Vol. 2022

Abstract


According to recent studies, young adults in India faced mental health issues, including low self-esteem and distress, due to university closures and loss of income, and 43% reported symptoms of anxiety and/or depressive disorder. This makes a solution urgent. A new classifier is proposed to identify individuals who may be experiencing depression based on their tweets on the social media platform Twitter. The proposed model is based on linguistic analysis and text classification, with probabilities calculated from TF-IDF (term frequency-inverse document frequency) features. Indians tend to tweet predominantly in English, Hindi, or a mix of the two languages (colloquially known as Hinglish). In the proposed approach, data is collected from Twitter and screened by passing it through a classifier built using the multinomial Naive Bayes algorithm and grid search, the latter being used for hyperparameter optimization. Each tweet is classified as depressed or not depressed. The entire architecture works over both English and Hindi, which should help with implementation globally and across multiple platforms, and help curb the ever-increasing depression rates in a methodical and automated manner. The proposed pipeline combines these techniques to improve results, attaining 96.15% accuracy and an F1 score of 0.914.
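The pipeline described in the abstract (TF-IDF features, a multinomial Naive Bayes classifier, and a grid search over hyperparameters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy tweets and labels are invented, the smoothed IDF formula and whitespace tokenizer are one common choice among several, the only hyperparameter searched is the Laplace smoothing value `alpha`, and leave-one-out accuracy stands in for whatever validation scheme the authors used. It is written in plain Python so it runs without external libraries; a real system would more likely use an established machine learning library.

```python
import math
from collections import Counter, defaultdict

# Invented toy data (NOT the paper's dataset): 1 = depressed, 0 = not depressed.
TRAIN = [
    ("i feel so hopeless and alone", 1),
    ("bahut udaas hoon kuch accha nahi lagta", 1),
    ("no energy life feels empty", 1),
    ("zindagi se thak gaya hoon feeling empty", 1),
    ("had a great day with friends", 0),
    ("aaj ka match bahut accha tha", 0),
    ("loving this weather so happy", 0),
    ("dosto ke saath movie dekhi great fun", 0),
]

def tokenize(text):
    # Assumption: lowercase whitespace tokens; works for romanized Hindi and English.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    # Term count times IDF; tokens unseen during training are dropped.
    tf = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def train_nb(samples, alpha):
    # Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing alpha.
    idf = fit_idf([doc for doc, _ in samples])
    class_docs = Counter(y for _, y in samples)
    weights = defaultdict(lambda: defaultdict(float))
    for doc, y in samples:
        for t, w in tfidf(doc, idf).items():
            weights[y][t] += w
    model = {"idf": idf, "log_prior": {}, "log_lik": {}, "classes": sorted(class_docs)}
    for y in class_docs:
        total = sum(weights[y].values()) + alpha * len(idf)
        model["log_prior"][y] = math.log(class_docs[y] / len(samples))
        model["log_lik"][y] = {t: math.log((weights[y][t] + alpha) / total) for t in idf}
    return model

def predict(model, doc):
    # Pick the class with the highest log posterior (up to a constant).
    feats = tfidf(doc, model["idf"])
    scores = {
        y: model["log_prior"][y]
        + sum(w * model["log_lik"][y][t] for t, w in feats.items())
        for y in model["classes"]
    }
    return max(scores, key=scores.get)

def loo_accuracy(samples, alpha):
    # Leave-one-out accuracy: train on all but one sample, test on the held-out one.
    hits = 0
    for i, (doc, y) in enumerate(samples):
        model = train_nb(samples[:i] + samples[i + 1:], alpha)
        hits += predict(model, doc) == y
    return hits / len(samples)

def grid_search(samples, alphas):
    # Exhaustively score each candidate alpha and keep the best one.
    scores = {a: loo_accuracy(samples, a) for a in alphas}
    return max(scores, key=scores.get), scores

best_alpha, cv_scores = grid_search(TRAIN, [0.01, 0.1, 1.0])
model = train_nb(TRAIN, best_alpha)
print(best_alpha, predict(model, "zindagi udaas hai"))
```

With only eight invented tweets the cross-validation scores carry no real meaning; the sketch is intended to show how the three components fit together, not to reproduce the reported accuracy or F1 score.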

Published in Journal of Mathematics

ISSN: 2314-4629 (Print); 2314-4785 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics
Website: https://onlinelibrary.wiley.com/journal/1469
