IEEE Access (Jan 2020)

Fast Piecewise Polynomial Fitting of Time-Series Data for Streaming Computing

  • Jianhua Gao,
  • Weixing Ji,
  • Lulu Zhang,
  • Senhao Shao,
  • Yizhuo Wang,
  • Feng Shi

DOI
https://doi.org/10.1109/ACCESS.2020.2976494
Journal volume & issue
Vol. 8
pp. 43764 – 43775

Abstract

Read online

Streaming computing attracts intense attention because of the demand for massive data analyzing in real-time. Due to unbounded and continuous input, the volume of streaming data is so high that all the data cannot be permanently stored. Piecewise polynomial fitting is a popular data compression method that approximately represents the raw data stream with multiple polynomials. The polynomial coefficients corresponding to the best-fitting curve can be calculated by the method of least squares, which minimizes the sum of the squared residuals between observed and fitted values. However, built on several matrix calculations, the method of least squares always leads to high time complexity and is difficult to be applied to streaming computing. This paper puts forward a fast piecewise polynomial fitting for time-series data in streaming computing. The input data stream is dynamically segmented according to a given residual bound. Meanwhile, the data points in each segment are fitted using an improved polynomial fitting method, which has less time overhead than general polynomial fitting by reusing the intermediate calculation results. Experimental results on four time-series datasets show that our algorithm can achieve the highest speedup to the general piecewise polynomial fitting of 2.82x for periodically sampled time-series data and 1.85x for aperiodically sampled time-series data, without affecting the compression ratio and fitting accuracy. Moreover, the event-time latency comparison in a streaming environment indicates that the improved method can endure higher throughput than general piecewise polynomial fitting with the same latency.

Keywords