Similarity-Based Hybrid Malware Detection Model Using API Calls

Asma A. Alhashmi; Abdulbasit A. Darem; Abdullah M. Alashjaee; Sultan M. Alanazi; Tareq M. Alkhaldi; Shouki A. Ebad; Fuad A. Ghaleb; Aloyoun M. Almadani

doi:10.3390/math11132944

Mathematics (Jun 2023)

Affiliations

Asma A. Alhashmi: Department of Computer Science, Northern Border University, Arar 9280, Saudi Arabia
Abdulbasit A. Darem: Department of Computer Science, Northern Border University, Arar 9280, Saudi Arabia
Abdullah M. Alashjaee: Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha 91911, Saudi Arabia
Sultan M. Alanazi: Department of Computer Science, Northern Border University, Arar 9280, Saudi Arabia
Tareq M. Alkhaldi: Department of Educational Technologies, Imam Abdulrahman Bin Faisal University, Dammam 34212, Saudi Arabia
Shouki A. Ebad: Department of Computer Science, Northern Border University, Arar 9280, Saudi Arabia
Fuad A. Ghaleb: School of Computing, Universiti Teknologi Malaysia, UTM, Johor Bahru 81310, Johor, Malaysia
Aloyoun M. Almadani: Department of Computer Science, Northern Border University, Arar 9280, Saudi Arabia

DOI: https://doi.org/10.3390/math11132944
Journal volume & issue: Vol. 11, no. 13, p. 2944

Abstract


This study presents a novel Similarity-Based Hybrid API Malware Detection Model (HAPI-MDM) that aims to improve malware detection accuracy by combining static and dynamic analysis of API calls. Malware authors routinely use obfuscation techniques, and conventional detection models often struggle to maintain robust performance against them. The proposed model addresses this issue with a two-stage learning approach in which the XGBoost algorithm acts as a feature extractor feeding into an Artificial Neural Network (ANN). The key innovation of HAPI-MDM is a similarity-based feature that further enhances the detection accuracy of the dynamic analysis, supporting reliable detection even in the presence of obfuscation. The model was evaluated using seven machine learning techniques with 10-fold cross-validation. Experimental results demonstrated HAPI-MDM’s superior performance, achieving an overall accuracy of 97.91% and the lowest false-positive and false-negative rates compared to related works. The findings suggest that integrating dynamic and static API-based features and adding a similarity-based feature significantly improves malware detection performance, offering an effective tool for strengthening cybersecurity measures against escalating malware threats.
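The abstract does not specify how the similarity-based feature is computed. As a rough illustration only, the sketch below assumes a Jaccard similarity between the set of API calls observed for a sample and a small collection of known malware API-call profiles. The function names, the profiles, and the choice of Jaccard similarity are all assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_feature(sample_calls: Iterable[str], known_malware: List[Set[str]]) -> float:
    """Highest Jaccard similarity between a sample's API-call set and any known profile."""
    calls = set(sample_calls)
    return max((jaccard(calls, profile) for profile in known_malware), default=0.0)


# Hypothetical API-call profiles, invented for illustration only.
KNOWN_MALWARE = [
    {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx", "OpenProcess"},
    {"RegSetValueExA", "CreateFileA", "InternetOpenA"},
]

# Hypothetical API-call traces for two samples.
suspicious = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CloseHandle"]
benign = ["CreateFileA", "ReadFile", "CloseHandle"]

suspicious_score = similarity_feature(suspicious, KNOWN_MALWARE)
benign_score = similarity_feature(benign, KNOWN_MALWARE)
```

In a pipeline like the one the abstract outlines, a score of this kind could be appended to the static and dynamic API features before they reach the first-stage learner. Whether the authors combine the features in that way is not stated here.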

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
