Journal of Information and Telecommunication (Apr 2018)

Imbalanced data classification using MapReduce and relief

  • Joanna Jedrzejowicz
  • Robert Kostrzewski
  • Jakub Neumann
  • Magdalena Zakrzewska

DOI
https://doi.org/10.1080/24751839.2018.1440454
Journal volume & issue
Vol. 2, no. 2
pp. 217 – 230

Abstract

Classification of imbalanced data has been reported to require modifications of standard classification algorithms and has recently attracted considerable attention owing to practical applications in industry, banking and finance. The aim of the paper is to examine algorithms known from the literature when two modifications are introduced: MapReduce to parallelize computations and Relief to select the most valuable attributes. Both modifications are needed in the Big Data area. Two new algorithms are also considered.
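
As a rough illustration of how the two modifications named in the abstract can fit together, the sketch below scores attributes with the basic two-class Relief rule (nearest hit and nearest miss) and splits the work in a map/reduce style. It is not taken from the paper: the function names (`map_partial`, `reduce_sum`, `relief_mapreduce`), the use of every instance instead of a random sample, the Manhattan distance on range-normalized attributes, and the in-process "mappers" are all assumptions made for the example.

```python
from functools import reduce


def _spans(X):
    """Per-attribute value range, used to normalize differences to [0, 1]."""
    d = len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    return [(hi[j] - lo[j]) or 1.0 for j in range(d)]


def _diff(j, a, b, span):
    """Normalized difference between instances a and b on attribute j."""
    return abs(a[j] - b[j]) / span[j]


def _dist(a, b, span):
    """Manhattan distance over normalized attributes (an assumed choice)."""
    return sum(_diff(j, a, b, span) for j in range(len(a)))


def map_partial(chunk, X, y, span):
    """Map step: accumulate Relief updates for the instance indices in chunk."""
    d = len(X[0])
    partial = [0.0] * d
    for i in chunk:
        hits = [k for k in range(len(X)) if k != i and y[k] == y[i]]
        misses = [k for k in range(len(X)) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind, so no update is possible
        hit = min(hits, key=lambda k: _dist(X[i], X[k], span))
        miss = min(misses, key=lambda k: _dist(X[i], X[k], span))
        for j in range(d):
            # Reward attributes that differ on the nearest miss,
            # penalize attributes that differ on the nearest hit.
            partial[j] += _diff(j, X[i], X[miss], span) - _diff(j, X[i], X[hit], span)
    return partial


def reduce_sum(a, b):
    """Reduce step: element-wise sum of two partial weight vectors."""
    return [u + v for u, v in zip(a, b)]


def relief_mapreduce(X, y, n_chunks=2):
    """Relief attribute weights, computed chunk by chunk and then combined."""
    n, d = len(X), len(X[0])
    span = _spans(X)
    chunks = [list(range(c, n, n_chunks)) for c in range(n_chunks)]
    partials = map(lambda chunk: map_partial(chunk, X, y, span), chunks)
    total = reduce(reduce_sum, partials, [0.0] * d)
    return [t / n for t in total]
```

Attributes with higher weights separate the classes better, so a selection step would keep the top-ranked ones. In a real MapReduce job each call to `map_partial` would run on a separate worker; here the chunks are processed sequentially only to keep the sketch self-contained.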

Keywords