IEEE Access (Jan 2022)
Identification of Return-Oriented Programming Attacks Using RISC-V Instruction Trace Data
Abstract
An increasing number of embedded systems include dedicated neural hardware. To benefit from this specialized hardware, deep learning techniques for discovering malware on embedded systems are needed. This effort evaluated candidate machine learning detection techniques for distinguishing exploited from non-exploited RISC-V program behavior using execution traces. We first developed a dataset of execution traces containing Return-Oriented Programming (ROP) exploitation on the RISC-V Instruction Set Architecture (ISA). We then developed several deep learning bidirectional Long Short-Term Memory (LSTM) models capable of distinguishing exploited traces from non-exploited traces, each using a different subset of features from the execution traces. An objective of this effort was to evaluate which features from an execution trace are application-specific (instruction addresses and immediate values), which are application-agnostic (opcodes and operands), and how these subsets of features affect model performance. Application-agnostic features allow a model to generalize its detection capability to ROP in previously unseen applications. The model using opcode and operand sequences obtained 98.21% cross-validation accuracy and 97.94% test accuracy. In contrast, a model using address values obtained 92.79% cross-validation accuracy with 99.59% test set accuracy. This research also analyzed whether ROP exploitation significantly affects branch prediction; experimental evidence suggests that it does. Thus, branch prediction behavior could be a valuable feature in detecting ROP exploits.
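The split between application-specific and application-agnostic features can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The textual trace format, the sample instructions, the function names `split_features` and `encode`, and the choice to treat register operands as agnostic while grouping immediates with addresses are all assumptions made for illustration.

```python
import re

# Hypothetical trace line format (an assumption, not the paper's format):
#   "<hex address> <opcode> <operand>, <operand>, ..."
SAMPLE_TRACE = [
    "0x80000130 addi sp, sp, -16",
    "0x80000134 sd ra, 8(sp)",
    "0x80000138 jal ra, 0x80000200",
    "0x80000200 ld ra, 8(sp)",
    "0x80000204 ret",
]

# A bare immediate such as "-16" or "0x80000200".
IMMEDIATE = re.compile(r"^-?(0x[0-9a-fA-F]+|\d+)$")
# A memory operand such as "8(sp)": an immediate offset plus a base register.
MEM_OPERAND = re.compile(r"^(-?(?:0x[0-9a-fA-F]+|\d+))\((\w+)\)$")


def split_features(line):
    """Split one trace line into (application-specific, application-agnostic) parts."""
    parts = line.split(None, 2)
    address, opcode = parts[0], parts[1]
    raw_operands = [p.strip() for p in parts[2].split(",")] if len(parts) > 2 else []

    registers, immediates = [], []
    for operand in raw_operands:
        mem = MEM_OPERAND.match(operand)
        if mem:
            immediates.append(int(mem.group(1), 0))
            registers.append(mem.group(2))
        elif IMMEDIATE.match(operand):
            immediates.append(int(operand, 0))
        else:
            registers.append(operand)

    # Addresses and immediates depend on where a given binary was linked and
    # what constants it uses, so they are grouped as application-specific.
    specific = {"address": int(address, 16), "immediates": immediates}
    # Opcode and register names recur across programs (assumed agnostic here).
    agnostic = [opcode] + registers
    return specific, agnostic


def encode(token_sequences):
    """Map tokens to integer IDs, reserving 0 for padding."""
    vocab = {"<pad>": 0}
    encoded = [
        [vocab.setdefault(token, len(vocab)) for token in tokens]
        for tokens in token_sequences
    ]
    return encoded, vocab


if __name__ == "__main__":
    agnostic_tokens = [split_features(line)[1] for line in SAMPLE_TRACE]
    ids, vocab = encode(agnostic_tokens)
    for line, row in zip(SAMPLE_TRACE, ids):
        print(f"{line:35s} -> {row}")
```

Integer sequences of this kind are the usual input to an embedding layer in front of a sequence model such as a bidirectional LSTM; the model itself is omitted here.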
Keywords