Robust Malware Family Classification Using Effective Features and Classifiers

Baraa Tareq Hammad; Norziana Jamil; Ismail Taha Ahmed; Zuhaira Muhammad Zain; Shakila Basheer

doi:10.3390/app12157877

Applied Sciences (Aug 2022)

Affiliations

Baraa Tareq Hammad: College of Computer Sciences and Information Technology, University of Anbar, Anbar 55431, Iraq
Norziana Jamil: College of Computing and Informatics, University Tenaga Nasional, Kajang 43000, Selangor, Malaysia
Ismail Taha Ahmed: College of Computer Sciences and Information Technology, University of Anbar, Anbar 55431, Iraq
Zuhaira Muhammad Zain: Department of Information Systems, College of Computer and Information Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
Shakila Basheer: Department of Information Systems, College of Computer and Information Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia

DOI: https://doi.org/10.3390/app12157877
Journal volume & issue: Vol. 12, no. 15
p. 7877

Abstract

Malware development has increased significantly in recent years, posing a serious security risk to both consumers and businesses. Malware developers continually find new ways to circumvent the defenses that security researchers build against malware attacks. Malware Classification (MC) entails assigning a malware class label to a specific sample, whereas malware detection merely entails finding malware without identifying which kind it is. There are two main reasons why the most popular MC techniques have a low classification rate. First, finding and developing accurate features requires highly specialized domain expertise. Second, data imbalance makes it challenging to classify and correctly identify malware. The proposed MC method consists of the following steps: (i) Dataset preparation: 2D malware images are created from the malware binary files; (ii) Visualized malware pre-processing: the malware images are scaled to fit the CNN model’s input size; (iii) Feature extraction: both hand-engineered (Tamura) and deep learning (GoogLeNet) techniques are used to extract the features; (iv) Classification: k-Nearest Neighbor (KNN), Support Vector Machines (SVM), and Extreme Learning Machine (ELM) are employed to perform malware classification. The proposed method is tested on the standard, unbalanced Malimg dataset. The proposed method achieved a very high accuracy rate, making it the most efficient of the options evaluated. The proposed method’s accuracy rate outperformed both the hand-crafted feature and deep feature techniques, at 95.42 and 96.84 percent.
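The steps listed in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The image width of 256, the 224x224 target size (the usual GoogLeNet input size), the nearest-neighbor resize, and all function names are assumptions made for illustration. The `toy_features` function is a hypothetical stand-in for the Tamura and GoogLeNet features, which are not reproduced here, and only the KNN classifier is sketched.

```python
import math
import statistics
from collections import Counter


def bytes_to_image(data: bytes, width: int = 256) -> list:
    """Step (i): treat each byte as one grayscale pixel (0-255), row by row.

    The width is an assumed value; the last row is zero-padded.
    """
    if not data:
        raise ValueError("empty binary")
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def resize_nearest(img: list, out_h: int = 224, out_w: int = 224) -> list:
    """Step (ii): nearest-neighbor rescale to an assumed CNN input size."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def toy_features(img: list) -> tuple:
    """Step (iii) placeholder: mean and population std of pixel intensity.

    Hypothetical stand-in only; the paper uses Tamura and GoogLeNet features.
    """
    pixels = [p for row in img for p in row]
    return (statistics.fmean(pixels), statistics.pstdev(pixels))


def knn_predict(train_x: list, train_y: list, x: tuple, k: int = 3):
    """Step (iv): majority vote among the k nearest training samples."""
    nearest = sorted(
        (math.dist(f, x), label) for f, label in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

A sample would pass through the functions in order: `toy_features(resize_nearest(bytes_to_image(raw_bytes)))` yields a feature vector, and `knn_predict` assigns it the family label that is most common among its nearest labeled neighbors.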

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
