Journal of King Saud University: Computer and Information Sciences (Jan 2023)

Opt-CoInfer: Optimal collaborative inference across IoT and cloud for fast and accurate CNN inference

  • Zhanhua Zhang
  • Hanqiao Yu
  • Fangzhou Wang

Journal volume & issue
Vol. 35, no. 1
pp. 438 – 448

Abstract

For fast and accurate Convolutional Neural Network (CNN) inference on massive Internet of Things (IoT) data, Collaborative Inference (CI) based on partition and compression techniques must carefully select a collaboration scheme that accounts for both the application scenario and the inference requirements. However, as partition and compression techniques proliferate, selecting a proper collaboration scheme becomes hard: the scheme space grows exponentially, and evaluating even a single scheme is time-consuming. Therefore, we present Opt-CoInfer, which searches for the optimal collaboration scheme for collaborative inference (i.e., the fastest scheme satisfying the accuracy requirement, or the most accurate scheme satisfying the latency requirement, in any specific scenario). Generally, Opt-CoInfer reaches the optimum iteratively, updating the local optimum and shrinking the feasible set until it is empty. Specifically, in each iteration, Opt-CoInfer selects a promising scheme and updates the local optimum according to a statistical model built from the schemes evaluated so far, then reduces the feasible set using the local optimum and several key observations about collaboration schemes. Experiments show that Opt-CoInfer outperforms single-end and state-of-the-art CI approaches under the same inference requirements in various application scenarios (e.g., up to 3.49× faster with the same accuracy requirement and as low as 37.41% accuracy loss with the same latency requirement).
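
The iterative search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a scheme is a hypothetical (partition point, compression level) pair, the "statistical model" is replaced by a toy nearest-neighbour latency guess, and the only "key observation" used is that, for a fixed partition point, stronger compression lowers both latency and accuracy. The names `search_fastest` and `toy_evaluate` and all numbers are invented.

```python
from itertools import product


def search_fastest(schemes, evaluate, min_accuracy):
    """Return (scheme, latency) for the fastest scheme meeting min_accuracy, or None."""
    feasible = set(schemes)   # schemes that could still be the optimum
    evaluated = {}            # scheme -> (latency, accuracy)
    best = None               # current local optimum: (scheme, latency)

    def predicted_latency(s):
        # Toy surrogate (assumption): reuse the latency of the nearest evaluated scheme.
        if not evaluated:
            return 0.0
        nearest = min(evaluated, key=lambda e: abs(e[0] - s[0]) + abs(e[1] - s[1]))
        return evaluated[nearest][0]

    while feasible:
        # Select the most promising remaining scheme according to the surrogate.
        s = min(feasible, key=lambda f: (predicted_latency(f), f))
        feasible.discard(s)
        latency, accuracy = evaluate(s)   # the expensive step in practice
        evaluated[s] = (latency, accuracy)
        p, c = s
        if accuracy >= min_accuracy:
            if best is None or latency < best[1]:
                best = (s, latency)
            # Assumed observation: weaker compression at the same partition is no faster.
            feasible = {f for f in feasible if not (f[0] == p and f[1] <= c)}
        else:
            # Assumed observation: stronger compression at the same partition is no more accurate.
            feasible = {f for f in feasible if not (f[0] == p and f[1] >= c)}
    return best


def toy_evaluate(scheme):
    """Invented latency (ms) and accuracy, both monotone in the compression level."""
    p, c = scheme
    latency = 10.0 * p + 80.0 / (p * c)
    accuracy = 0.92 - 0.01 * c * (5 - p)
    return latency, accuracy


schemes = list(product(range(1, 5), range(1, 5)))
print(search_fastest(schemes, toy_evaluate, min_accuracy=0.85))
```

Under the stated monotonicity assumption, each evaluation removes schemes that are either infeasible or no faster than an already evaluated feasible scheme, so the loop ends with the true optimum once the feasible set is empty. The symmetric goal (most accurate scheme under a latency bound) would swap the roles of latency and accuracy.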

Keywords