IEEE Access (Jan 2024)

Korean Voice Phishing Detection Applying NER With Key Tags and Sentence-Level N-Gram

  • Seunguk Yu,
  • Yejin Kwon,
  • Minju Kim,
  • Kiseong Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3387027
Journal volume & issue
Vol. 12
pp. 52951 – 52962

Abstract


Voice phishing is the criminal act of tricking others into transferring funds, or of seeking financial gain from illegally obtained personal information. The seriousness of this crime is recognized worldwide, and technical solutions have been proposed to curb the growing damage. In this paper, we propose a process for detecting voice phishing in Korean by applying named entity recognition (NER) with Key Tags and Sentence-level N-gram. Taking the human perspective, we collect financial counseling texts as the non-phishing dataset, since victims confuse voice phishing with such counseling. We carefully select Key Tags that are meaningful for distinguishing voice phishing from financial counseling texts, and we combine sentences into bundles to detect voice phishing effectively. Experiments with ten types of machine learning models showed that performance was maintained when information was generalized with Key Tags and improved when text bundles were combined. We hope that the proposed process can be effectively applied to other criminal scenarios in the future.
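
The abstract does not spell out how the two steps work, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "Key Tags" means replacing recognized entities with generic tag tokens, and that "Sentence-level N-gram" means joining n consecutive sentences into one bundle before classification. The lexicon, the tag names, the English example sentences, and the function names `apply_key_tags` and `sentence_ngrams` are all hypothetical. A real pipeline would use a trained Korean NER model rather than a lookup table.

```python
from typing import Dict, List

# Hypothetical entity-to-tag lexicon standing in for a trained NER model.
# The tag names are illustrative and are not the paper's actual Key Tags.
KEY_TAG_LEXICON: Dict[str, str] = {
    "Seoul Central Prosecutors' Office": "<ORGANIZATION>",
    "Kim Minsu": "<PERSON>",
    "3 million won": "<MONEY>",
}


def apply_key_tags(sentence: str, lexicon: Dict[str, str]) -> str:
    """Replace each known entity surface form with its generic tag."""
    # Longest surface forms first, so a short entry cannot split a longer one.
    for surface in sorted(lexicon, key=len, reverse=True):
        sentence = sentence.replace(surface, lexicon[surface])
    return sentence


def sentence_ngrams(sentences: List[str], n: int) -> List[str]:
    """Join every run of n consecutive sentences into one text bundle."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [" ".join(sentences[i:i + n]) for i in range(len(sentences) - n + 1)]


# Illustrative transcript (English stand-in for Korean call text).
transcript = [
    "This is Kim Minsu from the Seoul Central Prosecutors' Office.",
    "Your account was used in a crime.",
    "Please transfer 3 million won now.",
]
tagged = [apply_key_tags(s, KEY_TAG_LEXICON) for s in transcript]
bundles = sentence_ngrams(tagged, 2)
```

Under these assumptions, each bundle in `bundles` would then be vectorized and passed to a classifier. Bundling gives the model context that spans sentence boundaries, and the tags generalize over specific names and amounts.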

Keywords