Scientific African (Jun 2024)
Sampling-based novel heterogeneous multi-layer stacking ensemble method for telecom customer churn prediction
Abstract
Customer churn has become one of the most significant issues in business-oriented sectors, and telecommunication is no exception. Retaining current customers is particularly valuable because of intense rivalry among telecommunication companies and the high cost of acquiring new customers. Early prediction of churn may help telecommunication companies identify its causes and design tactics to address or mitigate it. Developing efficient and reliable customer churn prediction (CCP) solutions is therefore essential to controlling customer churn. Existing CCP studies have devised numerous methods for the CCP problem, including rule-based and machine-learning (ML) mechanisms. Nonetheless, limited adaptability and resilience are the major weaknesses of rule-based CCP solutions, and the skewed class distribution of churn datasets (class imbalance) is detrimental to the prediction performance of conventional ML models in CCP. Hence, this research developed a robust heterogeneous multi-layer stacking ensemble method (HMSE) for effective CCP. Specifically, in the HMSE method, five ML classifiers with distinct computational characteristics (Random Forest (RF), Bayesian network (BN), Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Repeated Incremental Pruning to Produce Error Reduction (RIPPER)) are combined by stacking, and the resulting model is further enhanced using a forest penalizing attribute (FPA) model. The synthetic minority oversampling technique (SMOTE) is integrated with the proposed HMSE, giving S-HMSE, to balance the skewed class distribution present in the original experimental datasets. Extensive tests were carried out to determine the effectiveness of the proposed HMSE and S-HMSE on standard telecom CCP datasets.
The experimental results showed that HMSE and S-HMSE can effectively predict churners even in the presence of class imbalance (skewed datasets). In addition, comparison studies demonstrated that the proposed S-HMSE offered better prediction performance for CCP in the telecom sector than baseline classifiers, homogeneous ensemble methods, and current CCP approaches.
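The class-balancing step that the abstract attributes to SMOTE can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (smote, balance), the toy data, the default k = 5, and the random choice of base samples are all assumptions made for illustration. The core idea is standard SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Hypothetical helper for illustration; base samples are drawn at random
    rather than visited in order as in the original SMOTE description.
    """
    if n_new == 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class of a binary dataset until both classes match."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new = smote(minority, max(0, n_majority - len(minority)), k, seed)
    return X + new, y + [minority_label] * len(new)


# Toy data (made up): 8 non-churners (label 0) and 3 churners (label 1).
X_demo = [[0.0, 0.0], [0.2, 0.1], [0.4, 0.3], [0.1, 0.5],
          [0.6, 0.2], [0.3, 0.7], [0.8, 0.4], [0.5, 0.9],
          [5.0, 5.0], [5.5, 5.2], [6.0, 4.8]]
y_demo = [0] * 8 + [1] * 3

X_bal, y_bal = balance(X_demo, y_demo)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In a stacking pipeline such as the one described above, oversampling of this kind would normally be applied to the training split only, so that the synthetic samples do not leak into the data used for evaluation.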