IEEE Access (Jan 2021)
Integrated Churn Prediction and Customer Segmentation Framework for Telco Business
Abstract
In the telco industry, attracting new customers is no longer a good strategy, since the cost of retaining existing customers is much lower. Churn management has therefore become instrumental for telco operators. Because few studies combine churn prediction with customer segmentation, this paper proposes an integrated customer analytics framework for churn management. The framework has six components: data pre-processing, exploratory data analysis (EDA), churn prediction, factor analysis, customer segmentation, and customer behaviour analytics. By integrating the churn prediction and customer segmentation processes, it provides telco operators with a complete churn analysis for better managing customer churn. Three datasets and six machine learning classifiers are used in the experiments. First, the churn status of the customers is predicted using the classifiers. The Synthetic Minority Oversampling Technique (SMOTE) is applied to the training set to address class imbalance, and 10-fold cross-validation is used to assess the models. Accuracy and F1-score are used for model evaluation. F1-score is considered an important metric for imbalanced datasets, since the premise of churn management is being able to identify the customers who will churn. Experimental analysis indicates that AdaBoost performed best on Dataset 1, with an accuracy of 77.19% and an F1-score of 63.11%. Random Forest performed best on Dataset 2, with an accuracy of 93.6% and an F1-score of 77.20%. On Dataset 3, Random Forest performed best in terms of accuracy, at 63.09%, while Multi-layer Perceptron performed best in terms of F1-score, at 42.84%. After churn prediction, Bayesian Logistic Regression is used to conduct the factor analysis and to identify important features for churn customer segmentation. Churn customer segmentation is then carried out using K-means clustering. Customers are segmented into different groups, which allows marketers and decision makers to target retention strategies more precisely.
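The role of F1-score on imbalanced churn data can be illustrated with a minimal sketch. The customer counts, the two sets of predictions, and the helper name `accuracy_and_f1` below are hypothetical and chosen only for illustration; they are not taken from the three datasets or from the paper's experiments.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the churn class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); reported as 0.0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical data: 100 customers, 10 of whom churn (label 1).
y_true = [1] * 10 + [0] * 90

# Baseline that never predicts churn.
never_churn = [0] * 100

# Hypothetical model: finds 6 of the 10 churners and raises 8 false alarms.
model = [1] * 6 + [0] * 4 + [1] * 8 + [0] * 82

baseline_acc, baseline_f1 = accuracy_and_f1(y_true, never_churn)
model_acc, model_f1 = accuracy_and_f1(y_true, model)

print(f"never-churn baseline: accuracy={baseline_acc:.2f}, F1={baseline_f1:.2f}")
print(f"hypothetical model:   accuracy={model_acc:.2f}, F1={model_f1:.2f}")
```

In this toy setting the baseline that identifies no churners scores higher on accuracy than the model that finds six of them, while its F1-score is zero. This is the kind of case in which accuracy alone would be misleading for churn management.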
Keywords