Complexity (Jan 2018)

Parallel Attribute Reduction Algorithm for Complex Heterogeneous Data Using MapReduce

  • Tengfei Zhang,
  • Fumin Ma,
  • Jie Cao,
  • Chen Peng,
  • Dong Yue

DOI
https://doi.org/10.1155/2018/8291650
Journal volume & issue
Vol. 2018

Abstract

Parallel attribute reduction is one of the most important topics in current research on rough set theory. Although several parallel algorithms have been documented, most of them still struggle to deal effectively with complex heterogeneous data that mixes categorical and numerical attributes. To address this problem, a novel attribute reduction algorithm based on neighborhood multigranulation rough sets was developed to process massive heterogeneous data in parallel. A MapReduce-based parallelization method for attribute reduction was proposed within the framework of neighborhood multigranulation rough sets. To improve reduction efficiency, hashing Map/Reduce functions were designed to speed up the positive region calculation. Building on these functions, a quick parallel attribute reduction algorithm using MapReduce was developed. The effectiveness and superiority of this parallel algorithm were demonstrated by theoretical analysis and comparison experiments.
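
The abstract does not give the Map/Reduce functions themselves, so the following is only an illustrative sketch of how a hashed positive region calculation might look, not the authors' algorithm. It runs in a single process, with a Python dictionary standing in for the MapReduce shuffle. All names (`map_record`, `reduce_granule`, `positive_region`) and the toy data are hypothetical. As a loud simplification, numerical attributes are bucketed into bins of width `delta` as a rough stand-in for a true neighborhood relation, and only a single granulation is considered.

```python
from collections import defaultdict
from math import floor


def map_record(record, attrs, numeric, delta):
    """Map step: emit (granule key, (object id, decision)).

    Assumption: numerical values are bucketed into bins of width delta,
    a coarse approximation of a delta-neighborhood.
    """
    obj_id, values, decision = record
    key = tuple(
        floor(values[a] / delta) if a in numeric else values[a]
        for a in attrs
    )
    return key, (obj_id, decision)


def reduce_granule(members):
    """Reduce step: keep a granule only if all its decisions agree."""
    decisions = {decision for _, decision in members}
    return [obj_id for obj_id, _ in members] if len(decisions) == 1 else []


def positive_region(records, attrs, numeric=frozenset(), delta=0.5):
    """Simulate map -> shuffle (by hashed key) -> reduce in one process."""
    groups = defaultdict(list)  # dict hashing plays the role of the shuffle
    for record in records:
        key, value = map_record(record, attrs, numeric, delta)
        groups[key].append(value)
    positive = []
    for members in groups.values():
        positive.extend(reduce_granule(members))
    return sorted(positive)


# Hypothetical toy decision table: (object id, attribute values, decision).
records = [
    (1, {"color": "red", "size": 0.10}, "yes"),
    (2, {"color": "red", "size": 0.20}, "no"),
    (3, {"color": "blue", "size": 0.90}, "no"),
    (4, {"color": "blue", "size": 0.70}, "no"),
    (5, {"color": "red", "size": 0.80}, "yes"),
]

print(positive_region(records, ["color", "size"], numeric={"size"}))
print(positive_region(records, ["color"]))
```

In this sketch, objects that hash to the same key land in the same group, so each consistency check touches only one granule rather than every pair of objects. Comparing the positive regions of the full attribute set and a candidate subset is the usual basis for judging whether an attribute can be dropped.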