IEEE Access (Jan 2024)

Navigating the Shadows: Manual and Semi-Automated Evaluation of the Dark Web for Cyber Threat Intelligence

  • Philipp Kuhn,
  • Kyra Wittorf,
  • Christian Reuter

DOI
https://doi.org/10.1109/ACCESS.2024.3448247
Journal volume & issue
Vol. 12
pp. 118903 – 118922

Abstract

Read online

In today’s world, cyber-attacks are becoming more frequent and thus proactive protection against them is becoming more important. Cyber Threat Intelligence (CTI) is a possible solution, as it collects threat information in various information sources and derives stakeholder intelligence to protect one’s infrastructure. The current focus of CTI in research is the clear web, but the dark web may contain further information. To further advance protection, this work analyzes the dark web as Open Source Intelligence (OSINT) data source to complement current CTI information. The underlying assumption is that hackers use the dark web to exchange, develop, and share information and assets. This work aims to understand the structure of the dark web and identify the amount of its openly available CTI related information. We conducted a comprehensive literature review for dark web research and CTI. To follow this up we manually investigated and analyzed 65 dark web forum (DWF), 7 single-vendor shops, and 72 dark web marketplace (DWM). We documented the content and relevance of DWFs and DWMs for CTI, as well as challenges during the extraction and provide mitigations. During our investigation we identified IT security relevant information in both DWFs and DWMs, ranging from malware toolboxes to hacking-as-a-service. One of the most present challenges during our manual analysis were necessary interactions to access information and anti-crawling measures, i.e., CAPTCHAs. This analysis showed 88% of marketplaces and 53% of forums contained relevant data. Our complementary semi-automated analysis of 1 186 906 onion addresses indicates, that the necessary interaction makes it difficult to see the dark web as an open, but rather treat it as specialized information source, when clear web information does not suffice.

Keywords