A Regularized Attribute Weighting Framework for Naive Bayes

Shihe Wang; Jianfeng Ren; Ruibin Bai

doi:10.1109/ACCESS.2020.3044946

IEEE Access (Jan 2020)

Affiliations

Shihe Wang: School of Computer Science, University of Nottingham Ningbo China, Ningbo, China
Jianfeng Ren: School of Computer Science, University of Nottingham Ningbo China, Ningbo, China
Ruibin Bai: School of Computer Science, University of Nottingham Ningbo China, Ningbo, China

DOI: https://doi.org/10.1109/ACCESS.2020.3044946
Journal volume & issue: Vol. 8
pp. 225639 – 225649

Abstract

The Bayesian classification framework has been widely used in many fields, but the covariance matrix is usually difficult to estimate reliably. To alleviate this problem, many naive Bayes (NB) approaches with good performance have been developed. However, the NB assumption of conditional independence between attributes rarely holds in practice, and various attribute-weighting schemes have been developed to address this. Among them, class-specific attribute weighted naive Bayes (CAWNB) has recently achieved good performance by using classification feedback to optimize the attribute weights of each class. However, the derived model may overfit the training dataset, especially when the dataset is too small to train a model that generalizes well. This paper proposes a regularization technique that improves the generalization capability of CAWNB by balancing the trade-off between discrimination power and generalization capability. More specifically, by introducing a regularization term, the proposed method, regularized naive Bayes (RNB), captures the data characteristics well when the dataset is large and exhibits good generalization performance when the dataset is small. RNB is compared with state-of-the-art naive Bayes methods. Experiments on 33 machine-learning benchmark datasets demonstrate that RNB significantly outperforms the compared methods.
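
To make the idea of class-specific attribute weighting with a regularization term more concrete, the following is a minimal Python sketch. It is not the formulation from the paper: the abstract does not state the objective, so the conditional log-likelihood loss, the L2 penalty that pulls each weight toward 1 (the plain NB setting), and the names `weighted_nb_log_posterior`, `regularized_objective`, and `lam` are all assumptions made for illustration.

```python
import numpy as np


def weighted_nb_log_posterior(log_prior, log_cond, weights, x):
    """Unnormalized class log-posteriors for one categorical instance.

    log_prior: shape (C,), log P(c)
    log_cond:  list of J arrays, each of shape (C, V_j), log P(x_j = v | c)
    weights:   shape (C, J), one weight per class and attribute
               (all ones reduces to plain naive Bayes)
    x:         length-J sequence of integer attribute values
    """
    scores = np.array(log_prior, dtype=float)
    for j, v in enumerate(x):
        # Each attribute's log-likelihood is scaled by its class-specific weight.
        scores += weights[:, j] * log_cond[j][:, v]
    return scores


def regularized_objective(log_prior, log_cond, weights, X, y, lam):
    """Negative conditional log-likelihood plus an L2 penalty (assumed form).

    The penalty lam * sum((w - 1)^2) is a hypothetical choice that shrinks
    the weights toward plain naive Bayes; the paper's regularizer may differ.
    """
    nll = 0.0
    for x, c in zip(X, y):
        s = weighted_nb_log_posterior(log_prior, log_cond, weights, x)
        m = s.max()
        log_z = m + np.log(np.exp(s - m).sum())  # stable log-sum-exp
        nll -= s[c] - log_z
    penalty = lam * np.sum((weights - 1.0) ** 2)
    return nll + penalty
```

Under these assumptions, a larger `lam` keeps the model close to plain NB, which may help when the training set is small, while a smaller `lam` lets the weights fit the classification feedback more closely when more data is available. The weights would be found by minimizing `regularized_objective` with any gradient-based optimizer.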

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
