Statistika: Statistics and Economy Journal (Jun 2019)
Use of Logistic Regression for Understanding and Prediction of Customer Churn in Telecommunications
Abstract
Customer churn, the loss of customers who switch to another service provider or do not renew their commitment, is very common in highly competitive and saturated markets such as telecommunications. Predictive models need to be implemented to identify customers who are at risk of churning and to discover the key drivers of churn. The aim of this paper is to use demographic and service usage variables to estimate a logistic regression model that predicts customer churn at a European telecommunications provider and to find the factors influencing customer churn. The estimated model yielded interesting findings: customers who tend to leave the company are typically younger, have been with the company for a shorter time, use mobile data and SMS more than traditional calls, have occasional problems paying bills, hold a student account, and have a contract ending in the near future. Interaction terms added as explanatory variables showed that the effects of data and voice usage vary with the year of birth. The quality of the logistic regression model was assessed with the Hosmer-Lemeshow test and pseudo R-squared measures. An independent testing data set was then used to evaluate the predictive ability of the model by computing performance metrics such as the area under the ROC curve (AUC), sensitivity, and precision. The resulting model caught 94.8% of the customers who in fact left the company. The quality of the model was also confirmed by a high AUC of 0.9759. Logistic regression is a very useful tool for predicting customer churn, thanks not only to its interpretability but also to its predictive power.
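The performance metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names, the 0.5 classification threshold, and the eight-customer toy data are assumptions made purely for illustration, and the numbers it produces have no relation to the reported 94.8% or 0.9759. Sensitivity is the share of actual churners the model flags, precision is the share of flagged customers who actually churn, and AUC is computed here as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner.

```python
def sensitivity_precision(y_true, y_pred):
    """Return (sensitivity, precision) for binary labels, where 1 = churned."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (churner, non-churner) pairs in which the
    churner has the higher score; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true outcomes and predicted churn probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Assumed threshold of 0.5 turns probabilities into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, prec = sensitivity_precision(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.4f} precision={prec:.4f} AUC={area:.4f}")
```

Sensitivity and precision depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason to report both kinds of metric side by side.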