Intelligent Systems with Applications (Nov 2023)
Exploring topic models to discern cyber threats on Twitter: A case study on Log4Shell
Gathering information about cyber threats from various sources can help organizations improve proactive cyber defense and mitigate potential cyber attacks. Recently, Twitter has been shown to be beneficial in providing timely Cyber Threat Intelligence (CTI) concerning cyber threats, software vulnerabilities, and exploits. However, manually identifying and investigating useful insights, patterns, and trends in large volumes of unstructured tweets is difficult. This work proposes an end-to-end data-driven framework to collect, analyze, and monitor tweets using unsupervised topic modeling techniques. A novel visualization technique is also proposed to monitor dynamic topic trends over time, offering an interpretable way to gain insights into a topic's lifecycle. A case study is conducted on the Log4Shell vulnerability incident to demonstrate the applicability of the proposed framework. Experiments are carried out on a real-world Twitter dataset collected from 47 users within the CTI community. Results indicate that the proposed framework can discover emerging topics relevant to real-world cybersecurity incidents, with Log4Shell-related topics identified before the public disclosure date recorded by the National Vulnerability Database (NVD). This framework can expedite the data processing workflow and visualization for cyber threat analysis, enabling organizations to identify trends and patterns that can potentially indicate a security breach or attack.
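To make the idea of unsupervised topic modeling followed by topic-trend monitoring concrete, the following is a minimal sketch, not the framework described above. It assumes Latent Dirichlet Allocation fitted by collapsed Gibbs sampling as the topic model (the abstract does not name the models used), and it assumes that counting each tweet's dominant topic per day is an acceptable proxy for a topic's lifecycle. The tweets, day labels, function names (`fit_lda`, `daily_topic_counts`), and hyperparameters are all invented for illustration.

```python
import random
from collections import Counter, defaultdict


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents.

    Returns per-document topic counts and per-topic word counts.
    Hyperparameters are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]   # count of word w in topic k
    topic_total = [0] * n_topics                        # total tokens in topic k

    # Random initial topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, assign[d]):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def daily_topic_counts(dates, doc_topic):
    """Count, for each date, how many documents have each dominant topic."""
    trend = defaultdict(Counter)
    for date, counts in zip(dates, doc_topic):
        dominant = max(range(len(counts)), key=counts.__getitem__)
        trend[date][dominant] += 1
    return trend


# Invented toy tweets with placeholder day labels (not real data).
tweets = [
    ("day1", "phishing campaign targets bank customers"),
    ("day1", "new phishing kit steals bank credentials"),
    ("day2", "log4j jndi lookup allows remote code execution"),
    ("day2", "phishing emails spoof bank login page"),
    ("day3", "log4j exploit attempts seen via jndi ldap"),
    ("day3", "patch log4j now remote code execution exploit"),
]
dates = [date for date, _ in tweets]
docs = [text.lower().split() for _, text in tweets]

doc_topic, topic_word = fit_lda(docs, n_topics=2)
trend = daily_topic_counts(dates, doc_topic)

for k, words in enumerate(topic_word):
    print(f"topic {k}:", [w for w, _ in words.most_common(4)])
for date in sorted(trend):
    print(date, dict(trend[date]))
```

On a corpus this small the recovered topics are unstable, so the printed topics should be read only as a demonstration of the data flow: tokenized tweets go in, topic counts per day come out, and those counts are what a trend visualization would plot.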