Egyptian Informatics Journal (Nov 2019)
Ina-BWR: Indonesian bigram word rule for multi-label student complaints
Abstract
Handling multi-label student complaints is an active research topic, and one technique commonly used for it is the Bag of Words (BoW) method. In this research, a bigram word rule and additional preprocessing are proposed to increase the accuracy of multi-label classification. To show the effectiveness of the proposed method, data from Telkom University students, together with additional relevant data collected via hashtags, are used as test data. We develop the Indonesian Bigram Word Rule for Multi-label Student Complaints (Ina-BWR) to identify multi-label student problems based on a bigram word rule. Ina-BWR consists of three processes: preprocessing informal text, identifying the complaint, and identifying the object in the text. Additional preprocessing techniques are applied to formalize the text: parsing hashtags, correcting affixed words, correcting conjunctions, parsing personal pronoun suffixes, and correcting typos. The Indonesian bigram word rule is adopted from opinion identification rules, with three additional corpora, (-)NN, (-)JJ and (-)VB, to identify student complaints. To identify complaints, four label corpora have been created manually. The experimental results show that Ina-BWR can increase the accuracy of the Personal, Subject and Relation labels. The best accuracy across the four labels is obtained when Ina-BWR is combined with the BoW method.
Keywords: Multi-label, Student complaints, Bag of word, Indonesian bigram word rule, Opinion identification rules
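As a rough illustration of the pipeline outlined in the abstract, the following Python sketch parses hashtags, forms word bigrams, and looks them up in a manually built label corpus. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names, the CamelCase hashtag-splitting heuristic, and the three toy corpus entries are all hypothetical, and the affix, conjunction, pronoun-suffix and typo corrections as well as the (-)NN, (-)JJ and (-)VB rules are not modeled.

```python
import re

# Hypothetical toy label corpus mapping a word bigram to a label.
# The entries are illustrative only; the paper's four manually built
# corpora are not reproduced here.
LABEL_BIGRAMS = {
    ("dosen", "telat"): "Relation",
    ("tugas", "banyak"): "Subject",
    ("kurang", "tidur"): "Personal",
}


def parse_hashtag(text):
    """Replace each hashtag with its lowercase words, split on CamelCase.

    Assumption: hashtags are written in CamelCase, e.g. #DosenTelat.
    """
    def split(match):
        words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", match.group(1))
        return " ".join(w.lower() for w in words)

    return re.sub(r"#(\w+)", split, text)


def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))


def label_complaint(text):
    """Return the set of labels whose corpus bigrams occur in the text."""
    tokens = re.findall(r"[a-z]+", parse_hashtag(text).lower())
    return {LABEL_BIGRAMS[b] for b in bigrams(tokens) if b in LABEL_BIGRAMS}


if __name__ == "__main__":
    # One complaint can receive several labels at once.
    print(label_complaint("#DosenTelat lagi, tugas banyak"))
```

Because the result is a set, a single complaint can carry several labels, which is the multi-label behaviour the abstract describes. Combining this with BoW features, as the abstract reports for the best accuracy, is outside the scope of this sketch.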