Systems Science & Control Engineering (Dec 2025)
Supervised methods of machine learning for email classification: a literature survey
Abstract
Email is a critical channel for global data exchange. As data volumes have surged, attackers increasingly exploit user identities and misuse data, relying on phishing and spam to compromise security. Machine learning offers a range of techniques for countering these threats and has proven effective at identifying phishing emails. Machine learning can be divided into two broad types: supervised and unsupervised. Supervised learning trains a model on labelled datasets and covers both classification and regression. Supervised methods such as support vector machines (SVMs), naive Bayes, decision trees, neural networks, random forests, and deep learning have been applied to spam filtering. This review examines spam filtering and email classification with supervised machine learning techniques, evaluating the strategies, methods, and performance indicators reported in the literature, along with the benefits and drawbacks of the individual studies. This information allows researchers to assess the efficiency and effectiveness of supervised learning algorithms and lays a foundation for more advanced email categorization techniques.
Keywords