IEEE Access (Jan 2019)

A Novel Parallel Biclustering Approach and Its Application to Identify and Segment Highly Profitable Telecom Customers

  • Qin Lin,
  • Huailing Zhang,
  • Xizhao Wang,
  • Yun Xue,
  • Hongxin Liu,
  • Changwei Gong

DOI
https://doi.org/10.1109/ACCESS.2019.2898644
Journal volume & issue
Vol. 7
pp. 28696 – 28711

Abstract

Identifying and segmenting the various kinds of highly profitable customers is a critical issue for telecom enterprises. However, the continual increase in the dimensionality and volume of data makes traditional approaches inefficient or even infeasible. To overcome these problems, this paper proposes a novel, statistically motivated parallel large sum submatrix biclustering algorithm based on Spark MapReduce (SP-PLSS). Unlike traditional approaches, SP-PLSS is driven by a newly proposed bicluster model and clusters customer samples and consumer attributes simultaneously, so that it can finely identify and segment the highly profitable customers who share similarly upscale purchasing behavior on a small fraction of the attributes. Furthermore, by implementing the MapReduce framework on a Spark platform, SP-PLSS significantly improves the efficiency and scalability of handling large datasets. Extensive experiments on a real-world telecom consumption dataset and on large synthetic datasets show that, in comparison with competing algorithms, SP-PLSS provides operators with a comparatively advanced, scalable, and feasible solution for identifying and segmenting highly profitable telecom customers, with superior clustering results.
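
To make the large sum submatrix idea concrete, the following is a minimal single-machine sketch, not the SP-PLSS algorithm itself. It assumes a fixed bicluster size (k customers by l attributes) and uses a plain alternating search: fix the columns and keep the k rows with the largest sums over them, then fix the rows and keep the l columns with the largest sums. The function name `large_sum_submatrix`, the parameters `k`, `l`, and `max_iter`, and the toy data are all hypothetical; the bicluster model, the statistical scoring, and the Spark MapReduce parallelization described in the abstract are not reproduced here.

```python
def large_sum_submatrix(matrix, k, l, max_iter=100):
    """Alternating search for k rows and l columns with a large total sum.

    Illustrative simplification only: the bicluster size (k, l) is fixed in
    advance, which is an assumption of this sketch, not of the paper.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Start from the l columns with the largest overall column sums.
    col_totals = [sum(matrix[i][j] for i in range(n_rows)) for j in range(n_cols)]
    cols = sorted(sorted(range(n_cols), key=lambda j: -col_totals[j])[:l])
    rows = []

    for _ in range(max_iter):
        # Columns fixed: keep the k rows with the largest sums over them.
        row_scores = [sum(matrix[i][j] for j in cols) for i in range(n_rows)]
        new_rows = sorted(sorted(range(n_rows), key=lambda i: -row_scores[i])[:k])

        # Rows fixed: keep the l columns with the largest sums over them.
        col_scores = [sum(matrix[i][j] for i in new_rows) for j in range(n_cols)]
        new_cols = sorted(sorted(range(n_cols), key=lambda j: -col_scores[j])[:l])

        # Stop once neither the row set nor the column set changes.
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols

    return rows, cols


# Toy data (hypothetical): 6 customers x 5 attributes, with high spending
# planted for customers 1, 3, 4 on attributes 0 and 2.
demo = [[1] * 5 for _ in range(6)]
for i in (1, 3, 4):
    for j in (0, 2):
        demo[i][j] = 9

rows, cols = large_sum_submatrix(demo, k=3, l=2)
print(rows, cols)  # [1, 3, 4] [0, 2]
```

In this sketch each alternating step only needs per-row or per-column sums, which is the kind of computation that maps naturally onto a map and reduce pattern; how SP-PLSS actually distributes the work is described in the paper, not here.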

Keywords