Churn Pattern Analysis and Prediction: Hybrid SOM+K-Means Segmentation and Machine Learning Classification
DOI:
https://doi.org/10.30865/jurikom.v13i1.9409
Keywords:
Bank Churn, Customer Segmentation, Hybrid Clustering, Churn Prediction, Machine Learning
Abstract
Customer churn is a significant challenge in the banking industry, with a substantial impact on profitability and long-term sustainability. Churn management is typically treated as a binary classification problem, an approach that often fails to provide the depth needed to understand customer characteristics. This study proposes customer segmentation as a preliminary step before classification, giving a clearer and deeper view of each segment's characteristics. The research combines Self-Organizing Map (SOM) and K-Means clustering to create interpretable segments. The SOM+K-Means model is used for segmentation and visual mapping, which helps identify customer groups at risk of churn and the key features that drive this risk. The cluster labels are then used as features for classification with three machine learning algorithms: Support Vector Machine (SVM), Random Forest (RF), and XGBoost (XGB). In the classification phase, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address class imbalance and GridSearchCV is used to tune model hyperparameters. XGB outperformed the other models, with an accuracy of 85% and an AUC score of 85%. These results indicate that customer segmentation with SOM+K-Means enables more effective churn management strategies, while XGB is a strong model for churn prediction. This research contributes to the application of clustering and machine learning classification techniques to churn analysis in the banking industry, offering a pathway to better customer retention strategies and lower churn rates.
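The segmentation stage described above (train a SOM, cluster its units with K-Means, then attach each customer's cluster label as an extra feature) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the grid size, number of clusters, learning schedule, synthetic data, and the function names `train_som`, `kmeans`, and `som_kmeans_labels` are all assumptions made for illustration, and the SMOTE, GridSearchCV, and classifier steps are not shown.

```python
import numpy as np

def train_som(X, rows=4, cols=4, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns a codebook of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise codebook vectors from randomly chosen training samples.
    weights = X[rng.integers(0, len(X), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighbourhood radius
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood weights
        weights += lr * h[:, None] * (x - weights)
    return weights

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-Means; returns one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centres.
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def som_kmeans_labels(X, k=3, **som_kwargs):
    """Cluster the SOM units with K-Means, then give each sample its unit's cluster."""
    codebook = train_som(X, **som_kwargs)
    unit_cluster = kmeans(codebook, k)
    bmus = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return unit_cluster[bmus]

# Synthetic stand-in for scaled customer features (assumption: 3 groups, 3 features).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(100, 3)) for m in (-2.0, 0.0, 2.0)])

segments = som_kmeans_labels(X, k=3)
# Append the segment label as an extra column for a downstream classifier.
X_augmented = np.column_stack([X, segments])
```

In a fuller pipeline, `X_augmented` would be split into training and test sets, with oversampling applied to the training portion only so that synthetic samples do not leak into evaluation.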



