ICT Express (Dec 2022)
Sumav: Fully automated malware labeling
Abstract
Multiple AV engines are often used together to protect systems more effectively against malicious files. These engines can distinguish between benign and malicious files, but even when a file has been determined to be malicious, identifying its family still requires consulting the AV labels that each engine provides. These labels often lack a consistent naming scheme, and even the family names differ from one AV engine to another. This study presents Sumav, a fully automated labeling tool that assigns each file a family name based on its AV labels. In previous studies, this task required prior knowledge or datasets of malicious files that had already been labeled. In contrast, Sumav can assign family names using only the AV labels. The system also requires no maintenance and can provide high-quality labeling performance even when the AV labeling scheme changes suddenly.