Journal of Digital Forensics, Security and Law (Mar 2010)

Clustering Spam Domains and Destination Websites: Digital Forensics with Data Mining

  • Chun Wei
  • Alan Sprague
  • Gary Warner
  • Anthony Skjellum

Journal volume & issue
Vol. 5, no. 1
pp. 21 – 48

Abstract

Spam-related cyber crimes have become a serious threat to society. Current spam research mainly aims to detect spam more effectively. We believe that prosecuting spammers is a more effective way of stopping spam emails than filtering; therefore, more research is needed to help forensic investigators collect useful evidence. This research proposes an algorithm for clustering spam domains extracted from spam emails based on their hosting IP addresses and for tracing the domains over a period of time. The results reveal several facts that merit law enforcement attention: many seemingly unrelated spam campaigns are actually related; spammers have a sophisticated mechanism for combating URL blacklisting, registering many new domain names every day and flushing out old domains; the domains are hosted at different IP addresses across several networks, mostly in China, where legislation is not as tight as in the US; old IP addresses are replaced by new ones from time to time, but the old and new addresses still show strong correlation. These facts lead to the conclusion that spam-related cyber crimes are operated by well-organized criminal syndicates that have sufficient manpower to distribute a huge volume of spam through bots, purchase a large number of domain names and hosting servers, and maintain websites to sell counterfeit products online. Traditional law enforcement technology has not scaled well to cases involving millions of data elements. This paper demonstrates an effective use of data mining to respond to this challenge.
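
The abstract does not give the details of the clustering algorithm, so the following is only a minimal sketch of the general idea of grouping spam domains by shared hosting IP addresses. It assumes a simple rule that is not taken from the paper: two domains belong to the same cluster if they share at least one hosting IP, directly or through a chain of other domains. The function name `cluster_domains_by_ip`, the union-find approach, and the sample domains and IP addresses (drawn from reserved documentation ranges) are all hypothetical.

```python
from collections import defaultdict


def cluster_domains_by_ip(domain_ips):
    """Group domains that share at least one hosting IP, transitively.

    domain_ips maps a domain name to the set of IPs it was seen hosted on.
    This linkage rule is an illustrative assumption, not the paper's method.
    """
    parent = {}

    def find(x):
        # Locate the cluster representative, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first domain seen on each IP; later domains on the
    # same IP are merged into that domain's cluster.
    first_domain_for_ip = {}
    for domain, ips in domain_ips.items():
        find(domain)  # register domains even if they have no IPs
        for ip in ips:
            if ip in first_domain_for_ip:
                union(first_domain_for_ip[ip], domain)
            else:
                first_domain_for_ip[ip] = domain

    clusters = defaultdict(set)
    for domain in domain_ips:
        clusters[find(domain)].add(domain)
    # Sort for a stable, readable result.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical sample data: names and addresses are placeholders.
SAMPLE = {
    "pills-a.example": {"192.0.2.10", "192.0.2.11"},
    "pills-b.example": {"192.0.2.11"},
    "watches-a.example": {"198.51.100.5"},
    "watches-b.example": {"198.51.100.5", "192.0.2.10"},
    "unrelated.example": {"203.0.113.9"},
}

if __name__ == "__main__":
    for cluster in cluster_domains_by_ip(SAMPLE):
        print(cluster)
```

In this sample, the "pills" and "watches" domains end up in one cluster because `watches-b.example` shares an IP with `pills-a.example`, which mirrors the abstract's observation that seemingly unrelated campaigns can be linked through their hosting infrastructure. Tracing domains over time, as the abstract describes, would require timestamped observations that this sketch does not model.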