Healthcare insurance fraud detection using data mining

Zain Hamid; Fatima Khalique; Saba Mahmood; Ali Daud; Amal Bukhari; Bader Alshemaimri

doi:10.1186/s12911-024-02512-4

BMC Medical Informatics and Decision Making (Apr 2024)

Affiliations

Zain Hamid: Department of Computer Science, Bahria University
Fatima Khalique: Department of Computer Science, Bahria University
Saba Mahmood: Department of Computer Science, Bahria University
Ali Daud: Faculty of Resilience, Rabdan Academy
Amal Bukhari: Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah
Bader Alshemaimri: Software Engineering Department, College of Computing and Information Sciences, King Saud University

DOI: https://doi.org/10.1186/s12911-024-02512-4
Journal volume & issue: Vol. 24, no. 1, pp. 1–24

Abstract

Background: Healthcare programs and insurance initiatives play a crucial role in ensuring that people have access to medical care. Despite the many benefits of healthcare insurance programs, fraud remains a significant challenge for the insurance industry. Detection is difficult because fraud schemes are sophisticated and evolve in response to detection methods. Analysis of large healthcare datasets is hindered by complexity, data quality issues, and the need for real-time detection, while privacy concerns and false positives pose additional hurdles. The lack of standardization in coding and limited resources further complicate efforts to address fraudulent activity effectively.

Methodology: In this study, a fraud detection methodology is presented that uses association rule mining augmented with unsupervised learning techniques to detect healthcare insurance fraud. The analysis uses the 2008-2010 DE-SynPUF dataset from the Centers for Medicare & Medicaid Services (CMS). The proposed methodology works in two stages. First, association rule mining is used to extract frequent rules from the transactions based on patient, service, and service provider features. Second, the extracted rules are passed to unsupervised classifiers, such as isolation forest (IF), cluster-based local outlier factor (CBLOF), empirical-cumulative-distribution-based outlier detection (ECOD), and one-class support vector machine (OCSVM), to identify fraudulent activity.

Results: Descriptive analysis shows patterns and trends in the data, revealing interesting relationships among diagnosis codes, procedure codes, and physicians. The baseline anomaly detection algorithms generated results in 902.24 seconds. A second experiment, which retrieved frequent rules using association rule mining with the Apriori algorithm and combined them with the unsupervised techniques, took 868.18 seconds. The silhouette score was used to assess the four anomaly detection techniques: CBLOF achieved the highest score of 0.114, followed by isolation forest with 0.103, while ECOD and OCSVM had lower scores of 0.063 and 0.060, respectively.

Conclusion: The proposed methodology enhances healthcare insurance fraud detection by using association rule mining for pattern discovery and unsupervised classifiers for effective anomaly detection.
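
The first stage described in the abstract, association rule mining with the Apriori algorithm, might be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The toy transactions, the "feature=value" item encoding, the diagnosis, procedure, and provider values, and the support and confidence thresholds are all assumptions made for the example; the abstract does not state the features or thresholds actually used. The second stage (passing the rules to an anomaly detector) is not shown.

```python
from itertools import combinations

# Hypothetical toy claims: each transaction is a set of "feature=value" items.
# The codes and provider IDs are invented for illustration only.
TRANSACTIONS = [
    frozenset({"dx=4019", "proc=9904", "provider=P1"}),
    frozenset({"dx=4019", "proc=9904", "provider=P1"}),
    frozenset({"dx=4019", "proc=9904", "provider=P2"}),
    frozenset({"dx=25000", "proc=3893", "provider=P2"}),
    frozenset({"dx=4019", "proc=3893", "provider=P3"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting `min_support`."""
    level = {frozenset({item}) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        # Keep only the candidates that meet the support threshold.
        kept = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        level = set()
        for a in kept:
            for b in kept:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in kept for sub in combinations(union, k - 1)
                ):
                    level.add(union)
    return frequent


def rules(frequent, min_confidence):
    """Derive (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in `frequent`.
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    out.append((antecedent, itemset - antecedent, s, confidence))
    return out


# Assumed thresholds, chosen only so the toy data yields a few rules.
frequent = apriori(TRANSACTIONS, min_support=0.4)
found = rules(frequent, min_confidence=0.9)

for antecedent, consequent, s, confidence in found:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={s:.2f} confidence={confidence:.2f}")
```

In a fuller pipeline, each claim could then be encoded by which mined rules it satisfies or violates and handed to an unsupervised detector such as an isolation forest. How the rules are encoded for that second stage is not specified in the abstract, so that step is left out here.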

Published in BMC Medical Informatics and Decision Making

ISSN: 1472-6947 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: http://bmcmedinformdecismak.biomedcentral.com
