IEEE Access (Jan 2016)
Multivariate Beta Mixture Model for Automatic Identification of Topical Authoritative Users in Community Question Answering Sites
Abstract
A community question answering (CQA) site is an online community that provides users with valuable information on a wide variety of topics in question-answer form. A major problem in CQA is identifying the authoritative users in the domain of a question, so that the question can be routed to the right experts and the best answer can be selected. Existing work suffers from one or more of the following limitations: 1) the lack of an automatic mechanism to distinguish between authoritative and non-authoritative users in specified topics; 2) a high dependence on training data in supervised learning, where manually obtaining labeled samples is too time-consuming; and 3) reliance on cutoff parameters to estimate an authority score. In this paper, a parameterless mixture model approach is proposed to identify topical authoritative users and overcome these limitations. A statistical framework based on multivariate beta mixtures is applied to each user's feature vector, which is composed of information related to the user's activities on the CQA site. The probability density function is then devised, and the beta mixture component that corresponds to the most authoritative users is identified. The suitability of the proposed approach is illustrated on real data from two CQA sites: StackOverflow and AskUbuntu. The results show that the proposed model performs remarkably well in identifying authoritative users compared with conventional classifiers and a Gaussian mixture model.
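To make the idea of fitting a beta mixture to user feature vectors concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm. It rests on several assumptions that do not come from the abstract: features are scaled to the open interval (0, 1), feature dimensions are treated as independent within a component, the number of components `k` is fixed by hand (the paper's parameterless selection is not reproduced), the M-step uses a weighted method-of-moments approximation instead of exact maximum likelihood, and the "authoritative" component is taken to be the one with the highest mean. The function names `fit_beta_mixture` and `authoritative_component` are hypothetical.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)  # elementwise log-gamma without SciPy


def log_beta_pdf(x, a, b):
    """Log density of x (N, D) under K components with shapes a, b (K, D).

    Assumes independent dimensions, so per-dimension log densities are summed.
    Returns an (N, K) array.
    """
    x = x[:, None, :]  # (N, 1, D) to broadcast against (K, D)
    log_norm = _lgamma(a + b) - _lgamma(a) - _lgamma(b)
    log_pdf = log_norm + (a - 1) * np.log(x) + (b - 1) * np.log1p(-x)
    return log_pdf.sum(axis=2)


def fit_beta_mixture(x, k=2, n_iter=100, eps=1e-6):
    """EM-style fit with a method-of-moments M-step (an approximation)."""
    x = np.clip(np.asarray(x, dtype=float), eps, 1 - eps)
    n, _ = x.shape
    # Deterministic initialisation: split users into k groups by mean feature.
    order = np.argsort(x.mean(axis=1))
    resp = np.zeros((n, k))
    for j, idx in enumerate(np.array_split(order, k)):
        resp[idx, j] = 1.0
    for _ in range(n_iter):
        # M-step: weighted mean and variance per component and dimension.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        mean = np.clip((resp.T @ x) / nk[:, None], eps, 1 - eps)
        var = (resp.T @ x**2) / nk[:, None] - mean**2
        # A beta distribution requires var < mean * (1 - mean).
        var = np.minimum(np.maximum(var, eps), 0.99 * mean * (1 - mean))
        common = mean * (1 - mean) / var - 1
        a, b = mean * common, (1 - mean) * common
        # E-step: posterior responsibility of each component for each user.
        log_r = np.log(weights) + log_beta_pdf(x, a, b)
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, a, b, resp


def authoritative_component(a, b):
    """Assumption: the component with the highest average mean a / (a + b)."""
    return int(np.argmax((a / (a + b)).mean(axis=1)))
```

Under these assumptions, users whose highest responsibility falls on the component returned by `authoritative_component` would be flagged as candidate authorities, with no hand-tuned score cutoff.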
Keywords