Journal of Big Data (Dec 2024)

Enhancing credit card fraud detection: highly imbalanced data case

  • Dalia Breskuvienė,
  • Gintautas Dzemyda

DOI
https://doi.org/10.1186/s40537-024-01059-5
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 24

Abstract

Fraud is a widespread challenge in today’s financial landscape, requiring innovative methods and technologies to detect and prevent losses from the sophisticated tactics used by fraudsters. This paper highlights the main issues in fraud detection and proposes a novel feature selection method called FID-SOM (feature selection for imbalanced data using SOM). Feature selection can significantly improve classification performance, and because fraud detection data are inherently imbalanced, it must be carried out with particular care. To accomplish this task, we use Self-Organizing Maps (SOMs), a special type of artificial neural network. FID-SOM is designed to address dimensionality reduction in highly imbalanced data and to efficiently process and analyze the vast, complex datasets commonly encountered in the financial sector, adapting to the dynamic nature of big data environments. The uniqueness of the proposed method lies in forming a new dataset containing the Best-Matching Units of the trained SOM as vectors of attributes corresponding to the initial features. These attributes are sorted by variance in descending order. By keeping the required number of attributes that hold the highest percentage of variability, we select the features corresponding to those attributes for further analysis. The proposed FID-SOM method has demonstrated its ability to perform on par with, if not surpass, existing methodologies. It also shows innovative potential.
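
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny NumPy SOM, the function names (train_som, select_features_by_bmu_variance), the grid size, the learning schedule, and the 90% variance threshold are all assumptions made for illustration, and any handling specific to class imbalance is omitted because the abstract does not describe it. The sketch trains a SOM, replaces each sample by the codebook vector of its Best-Matching Unit, ranks the resulting attributes by variance, and keeps the features that together hold the requested share of variability.

```python
import numpy as np

def train_som(X, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM (hypothetical helper, online updates)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialize codebook vectors from randomly chosen samples.
    W = X[rng.integers(0, n, size=rows * cols)].astype(float)
    # Grid coordinates of each neuron, used for neighborhood distances.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, t = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = t / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # linearly shrinking neighborhood
            bmu = np.argmin(((W - X[i]) ** 2).sum(axis=1))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood weights
            W += lr * h[:, None] * (X[i] - W)
            t += 1
    return W

def select_features_by_bmu_variance(X, W, var_share=0.9):
    """Rank attributes of the BMU dataset by variance and keep the top share."""
    # Index of the Best-Matching Unit for every sample.
    dists = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    bmu_idx = dists.argmin(axis=1)
    # New dataset: each sample is replaced by its BMU codebook vector.
    bmu_data = W[bmu_idx]
    variances = bmu_data.var(axis=0)
    order = np.argsort(variances)[::-1]              # descending variance
    cum = np.cumsum(variances[order]) / variances.sum()
    k = min(int(np.searchsorted(cum, var_share)) + 1, X.shape[1])
    return order[:k], variances

# Synthetic demo data (not from the paper): two informative, four near-constant features.
rng = np.random.default_rng(42)
X = np.hstack([rng.normal(0, 5, size=(300, 2)), rng.normal(0, 0.01, size=(300, 4))])
W = train_som(X)
selected, variances = select_features_by_bmu_variance(X, W, var_share=0.9)
print("selected feature indices:", sorted(selected.tolist()))
```

In practice the features would normally be scaled before training, since both the SOM distances and the variance ranking depend on feature scale; the abstract does not state which scaling the authors use.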

Keywords