BenchCouncil Transactions on Benchmarks, Standards and Evaluations (Feb 2023)

ERMDS: A obfuscation dataset for evaluating robustness of learning-based malware detection system

  • Lichen Jia,
  • Yang Yang,
  • Bowen Tang,
  • Zihan Jiang

Journal volume & issue
Vol. 3, no. 1
p. 100106

Abstract

Read online

Learning-based malware detection systems (LB-MDS) play a crucial role in defending computer systems from malicious attacks. Nevertheless, these systems can be vulnerable to various attacks, which can have significant consequences. Software obfuscation techniques can be used to modify the features of malware, thereby avoiding its classification as malicious by LB-MDS. However, existing portable executable (PE) malware datasets primarily use a single obfuscation technique, which LB-MDS has already learned, leading to a loss of their robustness evaluation ability. Therefore, creating a dataset with diverse features that were not observed during LB-MDS training has become the main challenge in evaluating the robustness of LB-MDS.We propose a obfuscation dataset ERMDS that solves the problem of evaluating the robustness of LB-MDS by generating malwares with diverse features. When designing this dataset, we created three types of obfuscation spaces, corresponding to binary obfuscation, source code obfuscation, and packing obfuscation. Each obfuscation space has multiple obfuscation techniques, each with different parameters. The obfuscation techniques in these three obfuscation spaces can be used in combination and can be reused. This enables us to theoretically obtain an infinite number of obfuscation combinations, thereby creating malwares with a diverse range of features that have not been captured by LB-MDS.To assess the effectiveness of the ERMDS obfuscation dataset, we create an instance of the obfuscation dataset called ERMDS-X. By utilizing this dataset, we conducted an evaluation of the robustness of two LB-MDS models, namely MalConv and EMBER, as well as six commercial antivirus software products, which are anonymized as AV1-AV6. The results of our experiments showed that ERMDS-X effectively reveals the limitations in the robustness of existing LB-MDS models, leading to an average accuracy reduction of 20% in LB-MDS and 32% in commercial antivirus software. We conducted a comprehensive analysis of the factors that contributed to the observed accuracy decline in both LB-MDS and commercial antivirus software. We have released the ERMDS-X dataset as an open-source resource, available on GitHub at https://github.com/lcjia94/ERMDS.

Keywords