Journal of Information Systems and Informatics (Mar 2025)

Hybrid Unsupervised Machine Learning for Insurance Fraud Detection: PCA-XGBoost-LOF and Isolation Forest

  • Natsai Chapwanya,
  • Karikoga Norman Gorejena

DOI
https://doi.org/10.51519/journalisi.v7i1.958
Journal volume & issue
Vol. 7, no. 1
pp. 941 – 959

Abstract

Insurance fraud poses a significant threat to the financial stability of insurance companies, resulting in substantial economic losses. To combat this issue, this study proposes a novel unsupervised machine learning hybrid algorithm that integrates Principal Component Analysis (PCA), Extreme Gradient Boosting (XGBoost), Local Outlier Factor (LOF), and Isolation Forest. The hybrid approach aims to improve the detection accuracy of insurance fraud by combining the strengths of the individual algorithms. Experiments on a real-world insurance dataset show a detection accuracy of 92%, a precision of 92%, and a recall of 96%. These results indicate that the proposed hybrid algorithm outperforms existing state-of-the-art methods, achieving a higher detection rate while reducing false positives. This research contributes to the development of effective insurance fraud detection systems, ultimately helping insurance companies to minimize financial losses and improve their overall profitability.
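
For readers unfamiliar with the three figures reported above, the following minimal Python sketch shows how accuracy, precision, and recall are conventionally computed from binary fraud labels. It is an illustration only and not the authors' evaluation code: the function name `detection_metrics` and the ten example labels are hypothetical, chosen solely to make the arithmetic easy to follow, and they do not reproduce the study's data or results.

```python
def detection_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate but flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is flagged or no fraud exists.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels for ten claims (assumption: 1 = fraudulent, 0 = legitimate).
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]

accuracy, precision, recall = detection_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this framing, a higher recall means fewer fraudulent claims are missed, while a higher precision means fewer legitimate claims are wrongly flagged, which is the sense in which the abstract speaks of reducing false positives.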

Keywords