JLIS.it (May 2024)

Towards a semi-automatic classifier of malware through tweets for early warning threat detection

  • Claudia Lanza,
  • Lorenzo Lodi

DOI
https://doi.org/10.36253/jlis.it-591
Journal volume & issue
Vol. 15, no. 2

Abstract

This paper presents a method for developing a malware ontology structure by detecting malware instances on Twitter. The ontology acts as a semi-automatic classifier fed by data extracted from tweets. The automatic part of the methodology relies on a pattern-based approach to detect trigger expressions that lead to new information about malware, whilst the manual part covers the evaluation of the results by domain experts, who also validate the reliability of the semantic relationships within the ontology framework. We present preliminary results from applying our methodology to tweets extracted from the MalwareBazaar database, showing how analysis of the document collection through Natural Language Processing (NLP) tasks can support knowledge retrieval and document classification procedures for building an early warning system for detected malware. The results of this research were obtained in 2023 and refer to the previous version of the social network now known as X.
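
As a rough illustration of what a pattern-based detection of trigger expressions might look like, the following minimal Python sketch scans a tweet for a few lexical patterns and returns candidate malware names for later expert validation. The trigger expressions, the labels `new_instance` and `is_a`, and the function name `extract_candidates` are all hypothetical: the abstract does not list the actual patterns used, so this should be read as a sketch under those assumptions rather than as the authors' implementation.

```python
import re

# Hypothetical trigger expressions, invented for illustration only;
# the abstract does not specify the patterns actually used.
TRIGGERS = [
    (
        "new_instance",
        re.compile(
            r"\b(?:new|latest)\s+(?:variant|sample|version)\s+of\s+"
            r"(?P<name>[A-Za-z][\w-]*)",
            re.IGNORECASE,
        ),
    ),
    (
        "is_a",
        re.compile(
            r"\b(?P<name>[A-Za-z][\w-]*)\s+is\s+an?\s+"
            r"(?P<cls>ransomware|trojan|botnet|stealer|worm)\b",
            re.IGNORECASE,
        ),
    ),
]


def extract_candidates(tweet):
    """Return candidate malware mentions found by the trigger patterns.

    Each candidate is a dict holding the trigger label, the matched name,
    and (when the pattern captures one) a candidate malware class.
    """
    candidates = []
    for label, pattern in TRIGGERS:
        for match in pattern.finditer(tweet):
            candidates.append(
                {
                    "trigger": label,
                    "name": match.group("name"),
                    # Only the "is_a" pattern defines a "cls" group.
                    "class": match.groupdict().get("cls"),
                }
            )
    return candidates
```

In a semi-automatic workflow of the kind described, candidates produced this way would not be added to the ontology directly; they would be queued for review by domain experts, who accept or reject each proposed name and relationship.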

Keywords