Meta-Learner for Amharic Sentiment Classification

Girma Neshir; Andreas Rauber; Solomon Atnafu

doi:10.3390/app11188489

Applied Sciences (Sep 2021)

Affiliations

Girma Neshir: IT Doctoral Program, Addis Ababa University, Addis Ababa P.O. Box 28762, Ethiopia
Andreas Rauber: Institute of Information Systems Engineering, Technical University of Vienna, Favoritenstraße 9-11/194-01, A-1040 Vienna, Austria
Solomon Atnafu: Department of Computer Science, Addis Ababa University, Addis Ababa P.O. Box 1176, Ethiopia

DOI: https://doi.org/10.3390/app11188489
Journal volume & issue: Vol. 11, no. 18, p. 8489

Abstract

The emergence of the World Wide Web has facilitated the growth of user-generated texts in less-resourced languages. Sentiment analysis of these texts may serve as a key performance indicator of the quality of services delivered by companies and government institutions. User-generated texts therefore offer managers and policy-makers an opportunity: they can be used to improve performance and increase customer satisfaction. Because of this potential, sentiment analysis has been widely researched in the past few years. A plethora of approaches and tools have been developed, albeit predominantly for well-resourced languages such as English. Resources for less-resourced languages such as Amharic, the language studied in this paper, are much less developed. As a result, sentiment analysis for such languages requires cost-effective approaches and massive amounts of annotated training data, which calls for different approaches to be applied. This research investigates the performance of a combination of heterogeneous machine learning algorithms (base learners such as SVM, RF, and NB). These models are fused by a meta-learner (in this case, logistic regression) for Amharic sentiment classification. An annotated corpus is provided for evaluation of the classification framework. The proposed stacked approach, applying SMOTE to TF-IDF character (1,7)-gram features, achieved an accuracy of 90%. Overall, the meta-learner (i.e., the stacked ensemble) improved on the performance of the base learners with TF-IDF character n-grams.
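The stacking setup described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The function name `build_stacked_classifier`, the hyperparameters, the choice of a linear SVM and multinomial NB, and the toy English sentences are all assumptions made for the example. The SMOTE oversampling step is left out; in practice it would be applied to the training folds only, for instance with a library such as imbalanced-learn.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def build_stacked_classifier(cv=2, seed=0):
    """Hypothetical helper: TF-IDF char (1,7)-grams feeding a stacked ensemble."""
    # Heterogeneous base learners named in the abstract: SVM, RF, and NB.
    # The concrete variants and settings here are assumptions.
    base_learners = [
        ("svm", LinearSVC(random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("nb", MultinomialNB()),
    ]
    # Logistic regression as the meta-learner, trained on the
    # cross-validated outputs of the base learners.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(),
        cv=cv,
    )
    # Character n-grams of length 1 to 7, weighted by TF-IDF.
    vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 7))
    return make_pipeline(vectorizer, stack)


# Toy placeholder data (NOT from the paper's Amharic corpus).
texts = [
    "the service was great",
    "very helpful staff",
    "fast and friendly support",
    "i am happy with the service",
    "the service was terrible",
    "very rude staff",
    "slow and unhelpful support",
    "i am unhappy with the service",
]
labels = ["pos"] * 4 + ["neg"] * 4

model = build_stacked_classifier()
model.fit(texts, labels)
predictions = model.predict(["helpful and friendly", "rude and slow"])
print(list(predictions))
```

Character n-grams avoid the need for a language-specific tokenizer or stemmer, which is one plausible reason they suit a morphologically rich, less-resourced language; the same pipeline accepts Amharic strings unchanged.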

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
