Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki (Oct 2024)

Obfuscated malware detection using deep neural network with ANOVA feature selection on CIC-MalMem-2022 dataset

  • Mourad Hadjila,
  • Mohammed Merzoug,
  • Wafaa Ferhi,
  • Djillali Moussaoui,
  • Al Baraa Bouidaine,
  • Mohammed Hicham Hachemi

DOI
https://doi.org/10.17586/2226-1494-2024-24-5-849-857
Journal volume & issue
Vol. 24, no. 5
pp. 849 – 857

Abstract

Malware analysis is the process of dissecting malicious software to understand its functionality, behavior, and potential risks. Artificial Intelligence (AI) and deep learning are enabling automated, intelligent, and adaptive malware analysis, and their convergence promises to change how cybersecurity professionals detect, analyze, and respond to malware threats. This paper proposes a Deep Neural Network (DNN) model built on features selected by the ANalysis Of VAriance (ANOVA) F-test (DNN-ANOVA), with the aim of increasing accuracy by identifying informative features. ANOVA is a feature selection method suited to numerical input data with a categorical target variable. The top k most relevant features are those whose scores exceed a threshold equal to the sum of all feature scores divided by the total number of features, that is, the mean feature score. Experiments are conducted on the CIC-MalMem-2022 dataset. Malware analysis is performed using binary classification, to detect the presence or absence of malware, and multiclass classification, to detect not only the malware but also its type. According to the test results, the DNN-ANOVA model achieves best values of 100 %, 99.99 %, 99.99 %, and 99.98 % for precision, accuracy, F1-score, and recall, respectively, in binary classification. In multiclass classification, DNN-ANOVA outperforms existing work, with overall accuracy rates of 85.83 % for family attacks and 73.98 % for individual attacks.
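The selection rule described in the abstract (keep the features whose ANOVA F-score exceeds the mean score) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the feature names `feat_a` to `feat_c`, and the toy data are all hypothetical, and in practice a library routine such as scikit-learn's `f_classif` would normally compute the scores.

```python
from collections import defaultdict


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of one numeric feature against class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n, k = len(values), len(groups)
    grand_mean = sum(values) / n

    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    if ss_within == 0:
        # Degenerate case (no within-class variance): this sketch scores it
        # 0.0 rather than dividing by zero.
        return 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_above_mean(columns, labels):
    """Keep features whose F-score is greater than the mean F-score."""
    scores = {name: anova_f_score(col, labels) for name, col in columns.items()}
    threshold = sum(scores.values()) / len(scores)
    selected = [name for name, score in scores.items() if score > threshold]
    return selected, scores, threshold


# Hypothetical toy data: 0 = benign, 1 = malware.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "feat_a": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],  # separates the classes well
    "feat_b": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],  # weak separation
    "feat_c": [2.0, 1.0, 3.0, 3.0, 1.0, 2.0],  # identical class means
}

selected, scores, threshold = select_above_mean(columns, labels)
print(selected)
```

With this rule the number of retained features k is not fixed in advance; it follows from how the scores are distributed around their mean, so one dominant feature can raise the threshold enough to exclude all the others.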

Keywords