Cogent Engineering (Dec 2024)

An analysis on classification models for customer churn prediction

  • Kathi Chandra Mouli,
  • Ch. V. Raghavendran,
  • V. Y. Bharadwaj,
  • G. Y. Vybhavi,
  • C. Sravani,
  • Khristina Maksudovna Vafaeva,
  • Rajesh Deorari,
  • Laith Hussein

DOI
https://doi.org/10.1080/23311916.2024.2378877
Journal volume & issue
Vol. 11, no. 1

Abstract

The rapid expansion of technical infrastructure has transformed business operations. One notable consequence of this digital evolution is the proliferation of subscription-based services. As customers gain an increasing array of options for goods and services, churn has emerged as a significant challenge that threatens businesses across sectors. Its direct impact on earnings has prompted businesses to develop tools for predicting potential client turnover, and identifying the underlying factors that contribute to churn is crucial for implementing effective retention strategies. This study presents a churn prediction model designed to help businesses identify clients at risk of churn. The proposed model uses machine learning classification techniques, with the customer data passing through several pre-processing phases before any model is applied. We systematically evaluated ten classification techniques: Logistic Regression, Support Vector Classifier, Kernel SVM, KNN, Gaussian Naïve Bayes, Decision Tree Classifier, Random Forest, AdaBoost, XGBoost, and Gradient Boosting. The assessment covered the mean and standard deviation of ROC AUC, the mean and standard deviation of accuracy, and the accuracy, precision, recall, F1 score, and F2 score. Using 10-fold cross-validation and hyperparameter tuning through GridSearchCV and RandomizedSearchCV, we identified Random Forest as the most effective classifier, achieving an Area Under the Curve (AUC) of 85%.
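
The evaluation procedure named in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. The synthetic dataset, the roughly 20% churn rate, the train/test split, and the small parameter grid are all hypothetical stand-ins chosen for illustration; they are not the authors' data or settings, and the scores this sketch produces are unrelated to the reported 85% AUC.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import fbeta_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for a customer table (assumption: about 20% churners).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# 10-fold stratified cross-validation, as named in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Hypothetical, deliberately small grid; the paper's grid is not given here.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # select hyperparameters by mean ROC AUC over the folds
    cv=cv,
)
search.fit(X_train, y_train)

# Score the refitted best model on the held-out split.
best = search.best_estimator_
test_auc = roc_auc_score(y_test, best.predict_proba(X_test)[:, 1])
# F2 weights recall more heavily than precision (beta = 2).
test_f2 = fbeta_score(y_test, best.predict(X_test), beta=2)

print("best params:", search.best_params_)
print("mean CV ROC AUC:", round(search.best_score_, 3))
print("test ROC AUC:", round(test_auc, 3), "test F2:", round(test_f2, 3))
```

Swapping GridSearchCV for RandomizedSearchCV (with `param_distributions` and `n_iter`) would sample the grid instead of exhausting it, which is usually the cheaper choice when the search space is large.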

Keywords