ChatPhishDetector: Detecting Phishing Sites Using Large Language Models

Takashi Koide; Hiroki Nakano; Daiki Chiba

doi:10.1109/ACCESS.2024.3483905

IEEE Access (Jan 2024)

Affiliations

Takashi Koide: NTT Security Holdings Corporation & NTT Corporation, Tokyo, Japan
Hiroki Nakano: NTT Security Holdings Corporation & NTT Corporation, Tokyo, Japan
Daiki Chiba: NTT Security Holdings Corporation & NTT Corporation, Tokyo, Japan

Journal volume & issue: Vol. 12, pp. 154381–154400

Abstract

Large Language Models (LLMs), such as ChatGPT, are significantly impacting various fields. While LLMs have been extensively studied for code generation and text synthesis, their application in detecting malicious web content, particularly phishing sites, remains largely unexplored. To counter the increasing cyber-attacks that leverage LLMs for creating more sophisticated and convincing phishing content, it is crucial to automate detection by harnessing LLMs’ advanced capabilities. This paper introduces ChatPhishDetector, a novel system that employs LLMs to identify phishing sites. Our approach involves using a web crawler to collect website information, generating prompts for LLMs based on the gathered data, and extracting detection results from LLM responses. This system enables accurate detection of multilingual phishing sites by identifying impersonated brands and social engineering techniques within the entire website context, without requiring machine learning model training. We evaluated our system’s performance using our own dataset and compared it with baseline systems and several LLMs. Experiments using GPT-4V showed exceptional results, achieving 98.7% precision and 99.6% recall, surpassing the detection performance of other LLMs and existing systems. These findings highlight the potential of LLMs for protecting users from online fraudulent activities and provide crucial insights for strengthening defenses against phishing attacks.
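
The pipeline the abstract outlines (collect website information, generate a prompt, extract a detection result from the LLM response) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `CrawlResult` fields, the wording of `PROMPT_TEMPLATE`, the JSON answer format, and the helper names `build_prompt` and `parse_verdict` are all hypothetical. The call to the LLM itself is deliberately left out.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlResult:
    """Data a crawler might collect for one page (hypothetical fields)."""
    url: str
    html: str
    ocr_text: str = ""  # assumed: text recovered from a screenshot


@dataclass
class Verdict:
    """Detection result extracted from an LLM response (hypothetical shape)."""
    phishing: bool
    brand: Optional[str]


# Assumed prompt wording; the paper's actual prompt is not reproduced here.
PROMPT_TEMPLATE = (
    "You are a web security analyst. Decide whether the page below is a "
    "phishing site. Consider any impersonated brand and any social "
    "engineering techniques.\n"
    "URL: {url}\n"
    "HTML (truncated): {html}\n"
    "Text extracted from screenshot: {ocr_text}\n"
    'Answer with JSON only: {{"phishing": true or false, "brand": string or null}}'
)


def build_prompt(page: CrawlResult, max_html_chars: int = 4000) -> str:
    """Fill the template, truncating the HTML to bound the prompt size."""
    return PROMPT_TEMPLATE.format(
        url=page.url,
        html=page.html[:max_html_chars],
        ocr_text=page.ocr_text,
    )


def parse_verdict(response_text: str) -> Verdict:
    """Extract the JSON object from a response that may contain extra prose.

    Raises ValueError if no parsable JSON object is present, and KeyError
    if the "phishing" field is missing.
    """
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM response")
    data = json.loads(response_text[start:end + 1])
    return Verdict(phishing=bool(data["phishing"]), brand=data.get("brand"))
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is being evaluated, and the reply text passed to `parse_verdict`. Asking for a fixed JSON shape is one simple way to make the result machine-readable; other output formats would work equally well.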

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
