Multiclass Boosting with Adaptive Group-Based kNN and Its Application in Text Categorization

Mathematical Problems in Engineering. 2012;2012 DOI 10.1155/2012/793490

 

Journal Title: Mathematical Problems in Engineering

ISSN: 1024-123X (Print); 1563-5147 (Online)

Publisher: Hindawi Publishing Corporation

LCC Subject Category: Technology: Engineering (General). Civil engineering (General) | Science: Mathematics

Country of publisher: Egypt

Language of fulltext: English

Full-text formats available: PDF, HTML, ePUB, XML

 

AUTHORS

Lei La (School of Automation, Beijing Institute of Technology, Beijing 100081, China)
Qiao Guo (School of Automation, Beijing Institute of Technology, Beijing 100081, China)
Dequan Yang (School of Automation, Beijing Institute of Technology, Beijing 100081, China)
Qimin Cao (School of Automation, Beijing Institute of Technology, Beijing 100081, China)

EDITORIAL INFORMATION

Blind peer review

Time From Submission to Publication: 26 weeks

 

ABSTRACT

AdaBoost is an excellent committee-based tool for classification. However, its effectiveness and efficiency in multiclass categorization are challenged by methods based on support vector machines (SVM), neural networks (NN), naïve Bayes, and k-nearest neighbor (kNN). This paper uses a novel multiclass AdaBoost algorithm that avoids reducing the multiclass classification problem to multiple two-class classification problems. This approach is more effective and retains the accuracy advantage of existing AdaBoost. An adaptive group-based kNN method is proposed to build more accurate weak classifiers and thereby keep the number of base classifiers within an acceptable range. To further enhance performance, the weak classifiers are combined into a strong classifier through a double iterative weighting scheme, yielding an adaptive group-based kNN boosting algorithm (AGkNN-AdaBoost). We implement AGkNN-AdaBoost in a Chinese text categorization system. Experimental results show that the proposed classification algorithm achieves better precision and recall than many other text categorization methods, including traditional AdaBoost. In addition, its processing speed is significantly higher than that of the original AdaBoost and many other classic categorization algorithms.
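
The abstract does not spell out the AGkNN-AdaBoost update rules, so the following is only a minimal sketch of the general idea of boosting kNN weak classifiers directly on a multiclass problem, without splitting it into two-class problems. It is not the authors' algorithm: it swaps in SAMME, a standard multiclass AdaBoost variant, and plain kNN over a weighted resample of the training set instead of the adaptive group-based kNN and the double iterative weighting. All function names, the toy 2-D points (standing in for document feature vectors), and the category labels are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(prototypes, x, k=1):
    """Majority vote among the k prototypes nearest to x (squared Euclidean)."""
    nearest = sorted(
        prototypes,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def samme_knn_fit(X, y, rounds=5, k=1, seed=0):
    """SAMME-style multiclass boosting with kNN weak classifiers.

    Assumes at least two classes. Each weak classifier is a kNN model
    whose prototypes are resampled according to the current weights.
    """
    rng = random.Random(seed)
    n = len(X)
    classes = sorted(set(y))
    K = len(classes)
    w = [1.0 / n] * n  # uniform sample weights to start
    ensemble = []
    for _ in range(rounds):
        # Weighted resample: hard examples are more likely to become prototypes.
        idx = rng.choices(range(n), weights=w, k=n)
        protos = [(X[i], y[i]) for i in idx]
        miss = [knn_predict(protos, X[i], k) != y[i] for i in range(n)]
        err = sum(wi for wi, m in zip(w, miss) if m)
        if err >= 1.0 - 1.0 / K:
            continue  # no better than random guessing among K classes
        err = max(err, 1e-10)  # avoid log(1/0) for a perfect weak classifier
        # SAMME weight: the log(K - 1) term is what handles K > 2 classes.
        alpha = math.log((1.0 - err) / err) + math.log(K - 1)
        ensemble.append((alpha, protos))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble, classes


def samme_knn_predict(model, x, k=1):
    """Return the class with the largest alpha-weighted vote."""
    ensemble, classes = model
    votes = {c: 0.0 for c in classes}
    for alpha, protos in ensemble:
        votes[knn_predict(protos, x, k)] += alpha
    return max(classes, key=lambda c: votes[c])


# Hypothetical toy data: three well-separated clusters, one per category.
data_rng = random.Random(1)
centers = {"sports": (0.0, 0.0), "finance": (10.0, 0.0), "science": (0.0, 10.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(10):
        X.append((cx + data_rng.uniform(-1, 1), cy + data_rng.uniform(-1, 1)))
        y.append(label)

model = samme_knn_fit(X, y)
```

Calling `samme_knn_predict(model, point)` returns one of the category labels directly. In a real text categorization setting the points would be high-dimensional term vectors, and the distance function and value of `k` would need to be chosen accordingly.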