Feature Selection for Partially Labeled Data Based on Neighborhood Granulation Measures

Bingyang Li; Jianmei Xiao; Xihuai Wang

doi:10.1109/ACCESS.2019.2903845

IEEE Access (Jan 2019)

Affiliations

Bingyang Li: Department of Electrical Engineering, Shanghai Maritime University, Shanghai, China
Jianmei Xiao: Department of Electrical Engineering, Shanghai Maritime University, Shanghai, China
Xihuai Wang: Department of Electrical Engineering, Shanghai Maritime University, Shanghai, China

DOI: https://doi.org/10.1109/ACCESS.2019.2903845
Journal volume & issue: Vol. 7, pp. 37238–37250

Abstract

Rough set theory is an effective feature selection technique and plays an important part in machine learning. However, it applies only to labeled data. In practice, many machine learning tasks, such as webpage classification, speech recognition, and text categorization, involve large amounts of partially labeled data. To remove redundant features from partially labeled data, this paper puts forward neighborhood granulation measures based on a neighborhood rough set model; these measures can evaluate the discernibility of feature subsets in both information systems and decision systems. A new definition of significance is also introduced and, based on it, a semisupervised reduction algorithm is presented for feature selection on partially labeled data. Several datasets are used to verify its effectiveness. Comparative experiments show that the proposed method is more effective for, and more applicable to, feature selection on partially labeled data.
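
To make the idea of a neighborhood-based granulation measure concrete, the following is a minimal sketch and not the paper's definition. It assumes Euclidean distance, a hypothetical radius parameter `delta`, and an illustrative measure (mean neighborhood size divided by the number of samples); the function names `neighborhoods` and `granulation` are likewise invented for this example. Because the measure uses no class labels, it can be computed on labeled and unlabeled samples alike.

```python
import numpy as np

def neighborhoods(X, features, delta):
    """Boolean matrix N where N[i, j] is True if sample j lies within
    distance delta of sample i, using only the chosen feature columns.
    Euclidean distance and the radius delta are assumptions of this sketch."""
    sub = X[:, features]
    # Pairwise Euclidean distances restricted to the selected features.
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=2)
    return dist <= delta

def granulation(X, features, delta):
    """Illustrative granulation value (not the paper's measure):
    mean neighborhood size divided by the number of samples.
    Smaller values mean finer granules, i.e. samples are told apart better."""
    n = X.shape[0]
    return neighborhoods(X, features, delta).sum() / (n * n)

# Four toy samples with two features; no labels are needed.
X = np.array([[0.0, 0.0],
              [0.1, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])

coarse = granulation(X, [0], delta=0.2)     # first feature only
fine = granulation(X, [0, 1], delta=0.2)    # both features
print(coarse, fine)
```

In this toy data the second feature separates samples that the first feature alone cannot, so `fine` comes out smaller than `coarse`. A greedy reduction procedure could use such a drop as a stand-in for the significance of a candidate feature, though the paper's own significance definition may differ.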

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
