IEEE Access (Jan 2024)

Hybrid Input Model Using Multiple Features From Surface Analysis for Malware Detection

  • Mamoru Mimura,
  • Satoki Kanno

DOI
https://doi.org/10.1109/ACCESS.2024.3452675
Journal volume & issue
Vol. 12
pp. 121198 – 121207

Abstract

Many malware detection models have been proposed to protect computers from the ever-increasing number of malware attacks. Features obtained from surface analysis are often used with machine learning for malware detection. Previous studies that performed surface analysis have proposed image-based methods using ensemble learning. However, no natural language processing (NLP)-based malware detection method that combines multiple features has yet been reported; previous NLP-based methods have focused only on single features. When multiple features are combined into a single data point before processing, both the word order and the detection rate are affected. Consequently, NLP techniques that take word order into account are difficult to apply to such combined data. This paper proposes a hybrid model that uses three features obtained from surface analysis for malware detection and demonstrates the effectiveness of using NLP techniques in combination with hybrid features. The F-measure for the combination of these three features was 0.927.
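
For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall. The sketch below illustrates that standard definition only; the function name `f_measure` and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-measure (F1): the harmonic mean of precision and recall.

    tp: malware samples correctly flagged as malware
    fp: benign samples wrongly flagged as malware
    fn: malware samples missed by the detector
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration; not results from the paper.
score = f_measure(tp=90, fp=10, fn=10)
print(f"F-measure: {score:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high F-measure requires both few false alarms and few missed samples.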

Keywords