IEEE Access (Jan 2023)
PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers
Abstract
In recent years, PowerShell has become the common tool that helps attackers launch targeted attacks using living-off-the-land tactics and fileless attack techniques. Unfortunately, malware-derived PowerShell Commands (PSCmds) have typically been obfuscated to hide the malicious intent from detection and analysis. Also, malicious PSCmds’ expansive use of multiple obfuscation strategies and encryption methods makes them difficult to be revealed. Despite the advances in malicious PSCmds detection incorporating new approaches such as machine learning and deep learning, there is still no consensus on the solution to de-obfuscating malicious PSCmds and profiling their behavior. To address this challenge, we propose a hybrid framework that combines deep learning and program analysis for automatic PowerShell De-obfuscation and behavioral Profiling (PowerDP) through multi-label classification in a static manner. First, we use character distribution features to forecast obfuscation types of malicious PSCmds. Second, we developed an extensive de-obfuscator utilizing static regular expression replacement to recover the original content of obfuscated PSCmds based on the predicted obfuscation types. Finally, we profile the behavior of PSCmds by features extracted from the abstract syntax tree of PSCmds after de-obfuscation. Our results show that PowerDP achieves a promising 99.82% accuracy and 0.18% hamming loss in obfuscation multi-label classification using deep learning. Furthermore, the successful recovery rate of the de-obfuscator against 15 obfuscation types is 98.11% on average with semantic similarity comparison, and the accuracy of the behavior multi-label classification for identifying 5 behaviors in malicious PSCmds averages 98.53%. The evaluation indicates that PowerDP is able to classify and profile complicated PSCmds.
Keywords