Alexandria Engineering Journal (Sep 2024)

A hybrid NLP and domain validation technique for disposable email detection

  • Rayan Alanazi,
  • Saad Alanazi

Journal volume & issue
Vol. 102
pp. 200 – 210

Abstract

Disposable email address services have grown in popularity, offering users a way to register for online services without revealing their primary email address. This poses a challenge for organizations that want to engage with genuine users. This paper introduces a novel approach that combines natural language processing (NLP) with domain validation to identify disposable email addresses. The technique employs several machine learning methods, including Support Vector Classifier (SVC), Multinomial Naive Bayes (MNB), Gaussian Naive Bayes (GNB), Logistic Regression, XGBoost Classifier, K Neighbors Classifier, Random Forest Classifier, and Linear Discriminant Analysis. Unlike traditional methods that rely on blacklists, the technique detects and classifies previously unseen disposable email addresses, achieving 97% accuracy.
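The general idea of pairing a text-based classifier with a domain check can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features or the validation step, so the sketch assumes character trigrams of the domain as the NLP features, a small hand-written Multinomial Naive Bayes as the classifier, and a purely syntactic regular expression as a stand-in for domain validation. The training domains, the `NgramNaiveBayes` class, and the `classify` helper are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data (label 1 = disposable, 0 = regular); not from the paper.
TRAIN = [
    ("tempinbox.example", 1),
    ("trashmail24.example", 1),
    ("throwawaybox.example", 1),
    ("tempmail-now.example", 1),
    ("cityhospital.example", 0),
    ("stateuniversity.example", 0),
    ("acmebank.example", 0),
    ("northwindtraders.example", 0),
]

# Assumption: syntactic hostname check only (labels of 1-63 characters,
# alphabetic TLD, at most 253 characters overall). No DNS lookup is made.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def char_ngrams(text, n=3):
    """Return character n-grams of the text, padded with start/end markers."""
    padded = f"^{text.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character trigrams, Laplace-smoothed."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.counts = defaultdict(Counter)  # label -> trigram counts
        self.class_totals = Counter()       # label -> total trigram count
        self.doc_counts = Counter()         # label -> number of samples
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            grams = char_ngrams(text)
            self.counts[label].update(grams)
            self.class_totals[label] += len(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each known trigram.
            score = math.log(self.doc_counts[label] / n_docs)
            denom = self.class_totals[label] + self.alpha * len(self.vocab)
            for gram in char_ngrams(text):
                if gram in self.vocab:  # trigrams unseen in training are skipped
                    score += math.log((self.counts[label][gram] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def classify(address, model):
    """Return 'invalid', 'disposable', or 'regular' for an email address."""
    local, sep, domain = address.rpartition("@")
    domain = domain.lower()
    if not sep or not local or not DOMAIN_RE.match(domain):
        return "invalid"
    return "disposable" if model.predict(domain) == 1 else "regular"


model = NgramNaiveBayes().fit(TRAIN)

if __name__ == "__main__":
    for addr in ("someone@trashinbox.example", "someone@citybank.example", "user@bad_domain"):
        print(addr, "->", classify(addr, model))
```

Because the classifier scores character patterns rather than looking up exact domains, it can flag a domain that never appeared in training, which is the property the abstract contrasts with blacklist lookups. A realistic system would need a far larger labeled dataset and, presumably, a stronger validation step than a regular expression.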

Keywords