IEEE Access (Jan 2020)

Multi-Loss Siamese Neural Network With Batch Normalization Layer for Malware Detection

  • Jinting Zhu,
  • Julian Jang-Jaccard,
  • Paul A. Watters

DOI
https://doi.org/10.1109/ACCESS.2020.3024991
Journal volume & issue
Vol. 8
pp. 171542 – 171550

Abstract


Malware detection is an essential task in cyber security. As malicious attacks grow in volume and variety, detecting unknown malware with high accuracy becomes increasingly challenging. Current deep learning-based approaches to malware detection are typically trained on large numbers of labeled samples drawn from existing malware families, so their ability to detect new, unseen malware (such as a zero-day attack) is limited. To address this issue, we propose a new one-shot model, called the “Multi-Loss Siamese Neural Network with Batch Normalization Layer,” that works with few samples while providing high detection accuracy. Our model uses a Siamese Neural Network, trained with only a few samples, to detect new variants of malware. The model is equipped with batch normalization to counter the overfitting caused by small training sets, and with multiple loss functions that mitigate the vanishing gradient problem arising from binary cross-entropy loss alone and shape the feature embedding space to improve detection accuracy. In addition, we illustrate a way to convert raw binary files into malware grayscale images and to generate the positive and negative pairs needed to train the Siamese Neural Network. Our experimental results show that our model outperforms existing similar methods.
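The sketch below is a minimal, hypothetical illustration of the pipeline the abstract describes, not the authors' released code: raw bytes reshaped into a grayscale image, a shared CNN branch with batch normalization, and a combined loss mixing binary cross-entropy on pair similarity with a contrastive term on the embedding distance. The layer sizes, image width, margin, and loss-weighting factor `alpha` are assumptions made for the example.

```python
# Hypothetical sketch of the ideas in the abstract; sizes and hyperparameters
# are illustrative assumptions, not values taken from the paper.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


def bytes_to_grayscale(path, width=64):
    """Read a raw binary file and reshape its bytes into a 2-D grayscale image in [0, 1]."""
    data = np.fromfile(path, dtype=np.uint8)
    rows = int(np.ceil(len(data) / width))
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[: len(data)] = data
    return padded.reshape(rows, width).astype(np.float32) / 255.0


class SiameseBranch(nn.Module):
    """Shared CNN encoder; BatchNorm after each convolution (assumed placement)."""

    def __init__(self, embed_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.fc = nn.Linear(64 * 4 * 4, embed_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))


class MultiLossSiamese(nn.Module):
    """Siamese pair model: one similarity logit for BCE plus raw embeddings for a contrastive term."""

    def __init__(self, embed_dim=128):
        super().__init__()
        self.branch = SiameseBranch(embed_dim)
        self.head = nn.Linear(embed_dim, 1)  # scores |e1 - e2| -> same / different family

    def forward(self, x1, x2):
        e1, e2 = self.branch(x1), self.branch(x2)
        logit = self.head(torch.abs(e1 - e2)).squeeze(1)
        return logit, e1, e2


def multi_loss(logit, e1, e2, label, margin=1.0, alpha=0.5):
    """Combined loss; label is a float tensor, 1.0 for a positive pair, 0.0 for a negative pair."""
    bce = F.binary_cross_entropy_with_logits(logit, label)
    dist = F.pairwise_distance(e1, e2)
    contrastive = torch.mean(
        label * dist.pow(2) + (1 - label) * torch.clamp(margin - dist, min=0).pow(2)
    )
    return bce + alpha * contrastive
```

In a few-shot setting such as the one the abstract targets, training pairs would be sampled so that positive pairs come from the same malware family and negative pairs from different families; at test time an unseen sample can be compared against a handful of labeled references per family.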

Keywords