Crime Science (Jun 2025)
Applications of AI-Based Models for Online Fraud Detection and Analysis
Abstract
Background Fraud is a prevalent offence whose harm extends beyond financial loss, impacting victims emotionally, psychologically, and physically. Advances in online communication technologies continue to create new opportunities for fraud, and fraudsters are increasingly using these channels for deception. With the progression of technologies like Generative Artificial Intelligence (GenAI), there is growing concern that fraud will increase in scale using these advanced methods, with offenders employing deep-fakes in phishing campaigns, for example. However, the application of AI, particularly Natural Language Processing (NLP), to detect and analyse patterns of online fraud remains understudied. This review addresses this gap by investigating the potential role of AI in analysing online fraud using text data.
Methods We conducted a Systematic Literature Review (SLR) to investigate the application of AI and NLP techniques for online fraud detection. The review adhered to the PRISMA-ScR protocol, with eligibility criteria including language, publication type, relevance to online fraud, use of text data, and AI methodologies. Out of 2457 academic records screened, 350 met our eligibility criteria, and 223 were analysed and included in this review.
Results We discuss the state-of-the-art AI and NLP techniques used to analyse various online fraud categories; the data sources used for training the AI and NLP models; the AI and NLP algorithms and models built; and the performance metrics employed for model evaluation. We find that current research on online fraud is divided according to the specific scam activities studied; in particular, we identify 16 different frauds that researchers focus on. Finally, we present the most recent and best-performing AI methods employed for detecting online scams and fraud activities.
Conclusions This SLR enhances academic understanding of AI-based detection methods for online fraud and offers insights for policymakers, law enforcement, and businesses on safeguarding against such activities. We conclude that existing approaches focusing on specific scams are unlikely to generalise effectively, as they will require new models to be developed for each fraud type. Furthermore, we conclude that the evolving nature of scams limits the effectiveness of models trained on outdated data. We also find that researchers often omit discussion of the limitations of their data or of training biases. Finally, we find inconsistencies in how model performance is reported, with some studies selectively presenting metrics, leading to potential biases in model evaluation.
Keywords