PLoS ONE (Jan 2023)

Fast and accurate interpretation of workload classification model.

  • Sooyeon Shim,
  • Doyeon Kim,
  • Jun-Gi Jang,
  • Suhyun Chae,
  • Jeeyong Lee,
  • U Kang

DOI
https://doi.org/10.1371/journal.pone.0282595
Journal volume & issue
Vol. 18, no. 3
p. e0282595

Abstract

How can we interpret the predictions of a workload classification model? A workload is a sequence of operations executed in DRAM, where each operation contains a command and an address. Classifying a given sequence into the correct workload type is important for verifying the quality of DRAM. Although a previous model achieves reasonable accuracy on workload classification, its predictions are hard to interpret because it is a black-box model. A promising direction is to exploit interpretation models, which compute how much each feature contributes to a prediction. However, none of the existing interpretation models is tailored to workload classification. The main challenges are to 1) provide interpretable features that further improve interpretability, 2) measure the similarity of features in order to construct interpretable super features, and 3) provide consistent interpretations over all instances. In this paper, we propose INFO (INterpretable model For wOrkload classification), a model-agnostic interpretable model that analyzes workload classification results. INFO provides interpretable results while producing accurate predictions. We design super features to enhance interpretability by hierarchically clustering the original features used by the classifier. To generate the super features, we define and measure an interpretability-friendly similarity, a variant of Jaccard similarity, between original features. INFO then explains the workload classification model globally by generalizing super features over all instances. Experiments show that INFO provides intuitive interpretations that are faithful to the original non-interpretable model. INFO also runs up to 2.0× faster than the competitor while achieving comparable accuracy on real-world workload datasets.
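
The idea of grouping original features into super features by hierarchical clustering over a Jaccard-style similarity can be illustrated with a minimal sketch. This is not the INFO implementation: the abstract does not define the interpretability-friendly similarity, so plain Jaccard similarity stands in for it here. The feature names, the representation of a feature as the set of instance ids in which it is active, the average-linkage rule, the `threshold` parameter, and summing attributions within a cluster are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Plain Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_features(feature_sets, threshold=0.5):
    """Greedy agglomerative clustering with average linkage.

    feature_sets maps a feature name to the set of instance ids in which
    that feature is active (a hypothetical representation). Merging stops
    once no pair of clusters has average similarity >= threshold.
    """
    clusters = [[name] for name in sorted(feature_sets)]

    def linkage(c1, c2):
        sims = [jaccard(feature_sets[x], feature_sets[y]) for x in c1 for y in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


def super_attributions(clusters, attributions):
    """Sum per-feature attributions inside each cluster (an assumed aggregation)."""
    return {tuple(c): sum(attributions[f] for f in c) for c in clusters}


# Hypothetical command@address features and the instances where each occurs.
features = {
    "ACT@bank0": {0, 1, 2, 3},
    "RD@bank0": {0, 1, 2},
    "WR@bank1": {4, 5, 6},
    "PRE@bank1": {4, 5, 6, 7},
}
# Hypothetical per-feature attribution scores from some interpretation model.
scores = {"ACT@bank0": 0.25, "RD@bank0": 0.25, "WR@bank1": 0.125, "PRE@bank1": 0.375}

groups = cluster_features(features, threshold=0.5)
group_scores = super_attributions(groups, scores)
```

In this toy input, features that co-occur in the same instances end up in the same group, so each group can be read as one super feature with a single aggregated score. A lower `threshold` would merge more aggressively and yield coarser super features.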