Applied Sciences (Aug 2023)
Inference Latency Prediction Approaches Using Statistical Information for Object Detection in Edge Computing
Abstract
To seamlessly deliver artificial intelligence (AI) services based on object detection, inference latency from a system perspective must be considered as important as inference accuracy. Although edge computing can be applied to operate such AI services efficiently by significantly reducing inference latency, deriving an optimized computational offloading policy for edge computing is a challenging problem. In this paper, we propose inference latency prediction approaches for determining the optimal offloading policy in edge computing. Because image size is not correlated with inference latency in object detection, approaches that predict inference latency are required to find the optimal offloading policy. The proposed approaches predict the inference latency for each combination of device and object detection algorithm by using statistical information on previously observed inference latencies. By exploiting the predicted inference latency, a client can efficiently determine whether to execute an object detection task locally or remotely. Through various experiments, the prediction performance for each object detection algorithm is compared and analyzed under two communication protocols in terms of the root mean square error. The simulation results show that the predicted inference latency closely matches the actual inference latency.
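The abstract does not specify implementation details, but the decision it describes, offloading only when the predicted remote latency is lower than the predicted local latency, can be illustrated with a minimal sketch. The sketch below assumes the predictor is simply the running mean of past latency observations per device and detector pair; the names (LatencyPredictor, should_offload, est_transfer_s) and the example values are hypothetical and not taken from the paper.

```python
# Minimal sketch of a statistics-based offloading decision (illustrative only).
# Assumption: predicted latency = mean of previously observed latencies
# for a given (device, detector) pair.
from collections import defaultdict
from statistics import mean


class LatencyPredictor:
    """Predict inference latency from past observations (hypothetical helper)."""

    def __init__(self):
        # (device, detector) -> list of observed latencies in seconds
        self.history = defaultdict(list)

    def record(self, device, detector, latency_s):
        self.history[(device, detector)].append(latency_s)

    def predict(self, device, detector, default=float("inf")):
        samples = self.history[(device, detector)]
        return mean(samples) if samples else default


def should_offload(predictor, detector, est_transfer_s):
    """Offload when predicted remote latency plus transfer time beats local latency."""
    local = predictor.predict("client", detector)
    remote = predictor.predict("edge_server", detector) + est_transfer_s
    return remote < local


# Example usage with made-up measurements
pred = LatencyPredictor()
pred.record("client", "yolov5s", 0.42)
pred.record("edge_server", "yolov5s", 0.08)
print(should_offload(pred, "yolov5s", est_transfer_s=0.15))  # True: 0.23 < 0.42
```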
Keywords