Ensemble Learning based Optimized Random Oversampling for Handling Class Imbalance of Customer Churn Prediction in Telecommunication
Keywords:
Ensemble Learning (EL), Big Data (BD), Machine Learning (ML)
Abstract
The era of big data is now more than two decades old. In telecommunication, it has created a need to mine imbalanced data on customers at risk of churn for deeper insight, so that a company can turn data into revenue by retaining its existing customers. Yet even after more than two decades of Machine Learning (ML) model development through Big Data (BD) analytics, class imbalance in data remains an active focus of research. In this paper, a novel Optimized Random Oversampling Technique (OROT) is presented to handle class imbalance and to improve the accuracy of an Ensemble Learning (EL) model. The experiment was conducted on Cell2Cell, an open-source dataset with 58 features and 51,047 instances. It is concluded that the Ensemble Learning based Optimized Random Oversampling Technique (ELOROT) can contribute significantly to Customer Churn Prediction and Retention (CCPR) in telecommunication by addressing class imbalance.
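As a point of reference for the baseline that OROT builds on, the following is a minimal sketch of plain random oversampling for a binary label, written with NumPy only. It is not the OROT procedure itself, which the abstract does not specify. The function name `random_oversample`, the `target_ratio` parameter, and the toy data are assumptions introduced purely for illustration.

```python
import numpy as np

def random_oversample(X, y, minority_label, target_ratio=1.0, seed=0):
    """Plain random oversampling (illustrative, not the paper's OROT).

    Duplicates randomly chosen minority rows, with replacement, until the
    minority count reaches target_ratio * majority count.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    minority_idx = np.flatnonzero(y == minority_label)
    n_majority = len(y) - len(minority_idx)

    # Number of extra minority rows needed to reach the requested ratio.
    n_needed = int(round(target_ratio * n_majority)) - len(minority_idx)
    if n_needed <= 0 or len(minority_idx) == 0:
        return X.copy(), y.copy()

    # Sample minority row indices with replacement and append them.
    extra = rng.choice(minority_idx, size=n_needed, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

# Toy data (hypothetical): 8 non-churners (0) and 2 churners (1).
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = random_oversample(X, y, minority_label=1)
```

In practice, resampling of this kind would normally be applied to the training split only, so that duplicated minority rows do not leak into the evaluation data.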