IEEE Access (Jan 2021)
Performance Prediction Method for Stream Computing Platform Based on Time Series
Abstract
Stream computing is one of the most popular high-performance data processing technologies, but existing task and resource scheduling strategies for stream computing platforms suffer from triggering hysteresis, which seriously degrades cluster performance. To address this problem, a performance prediction approach based on time series data is proposed. First, the performance variation pattern of the stream processing platform is analyzed, providing a basis for the performance prediction model. Second, a basic topology paradigm, a periodic load prediction model, and a real-time throughput prediction model are proposed as the theoretical foundation for performance prediction. Third, a performance prediction algorithm is proposed to predict the variation trends of the cluster's processing load and throughput: the processing load is predicted periodically, while the throughput is predicted in real time. Finally, a performance evaluation algorithm is proposed to evaluate cluster performance based on the prediction results and to trigger the corresponding scheduling strategies in advance. Experimental results show that the prediction accuracy of the proposed algorithm meets the requirements of practical applications, and that the proposed method improves the performance of the stream computing platform by triggering scheduling strategies in advance.
Keywords