Computers (Sep 2023)
Rapidrift: Elementary Techniques to Improve Machine Learning-Based Malware Detection
Abstract
Artificial intelligence and machine learning have become a necessary part of modern living along with the increased adoption of new computational devices. Because machine learning and artificial intelligence can detect malware better than traditional signature detection, the development of new and novel malware aiming to bypass detection has caused a challenge where models may experience concept drift. However, as new malware samples appear, the detection performance drops. Our work aims to discuss the performance degradation of machine learning-based malware detectors with time, also called concept drift. To achieve this goal, we develop a Python-based framework, namely Rapidrift, capable of analysing the concept drift at a more granular level. We also created two new malware datasets, TRITIUM and INFRENO, from different sources and threat profiles to conduct a deeper analysis of the concept drift problem. To test the effectiveness of Rapidrift, various fundamental methods that could reduce the effects of concept drift were experimentally explored.
Keywords