Journal of Big Data (Nov 2024)
An integrated multistage ensemble machine learning model for fraudulent transaction detection
Abstract
Fraudulent transactions remain a concern for financial institutions and organizations, necessitating effective detection tools. Credit card fraud detection is central to identifying and preventing fraudulent transactions. Although instances of credit card fraud are rare, they can cause significant financial losses because of the high cost of fraudulent transactions. When fraud is discovered early, investigators can act quickly to stop further losses. However, because investigations take time, only a limited number of alerts can be examined in detail on a given day. A fraud detection model’s main goal is therefore to produce accurate alerts while minimizing false alarms and missed fraud cases. To improve fraud identification, this study presents an integrated multistage ensemble machine learning (IMEML) model that combines several multistage ensemble models: the Ensemble Independent Classifier (EIC), the Ensemble Bagging Classifier (EBC), and the Ensemble ML Classifier (EMC). To address data imbalance, we use several methods that go beyond traditional approaches, including Instance Hardness Threshold with EMC (IHT+EMC), Cluster Centroids (CC), and Random Under Sampler (RUS). We run our experiments on a publicly available credit card dataset of 284,807 transactions. The proposed model achieves 99.94% accuracy, 99.91% precision, 99.14% recall, a 99.52% F1-score, and a 100% AUC score. By outperforming state-of-the-art techniques, the EIBMC model sets a new benchmark for identifying fraudulent transactions in high-frequency, real-world fraud detection applications.
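As a rough illustration of two ideas named in the abstract, the sketch below shows plain random under-sampling (RUS) and how precision, recall, and F1-score are computed from predictions. It is a minimal standard-library example under stated assumptions, not the authors' implementation: the function names `random_under_sample` and `precision_recall_f1`, the toy data, and the label convention (1 = fraud, 0 = legitimate) are all hypothetical, and the IHT and CC methods are not covered.

```python
import random


def random_under_sample(rows, labels, seed=0):
    """Keep every minority-class row plus an equal-sized random subset
    of the majority class (hypothetical helper, labels are 0 or 1)."""
    rng = random.Random(seed)
    fraud = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is treated as the minority.
    if len(fraud) <= len(legit):
        minority, majority = fraud, legit
    else:
        minority, majority = legit, fraud
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up data: 1,000 transactions, of which 10 are labelled fraud.
toy_labels = [1 if i % 100 == 0 else 0 for i in range(1000)]
toy_rows = [[float(i)] for i in range(1000)]
balanced_rows, balanced_labels = random_under_sample(toy_rows, toy_labels)
```

Under-sampling of this kind discards most legitimate transactions, which is one reason a study might also consider more selective approaches such as the IHT and CC methods listed above.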
Keywords