Applied Sciences (Sep 2023)

SbMBR Tree—A Spatiotemporal Data Indexing and Compression Algorithm for Data Analysis and Mining

  • Runda Guan,
  • Ziyu Wang,
  • Xiaokang Pan,
  • Rongjie Zhu,
  • Biao Song,
  • Xinchang Zhang

DOI
https://doi.org/10.3390/app131910562
Journal volume & issue
Vol. 13, no. 19
p. 10562

Abstract

Read online

In the field of data analysis and mining, adopting efficient data indexing and compression techniques to spatiotemporal data can significantly reduce computational and storage overhead for the abilities to control the volume of data and exploit the spatiotemporal characteristics. However, traditional lossy compression techniques are hardly suitable due to their inherently random nature. They often impose unpredictable damage to scientific data, which affects the results of data mining and analysis tasks that require certain precision. In this paper, we propose a similarity-based minimum bounding rectangle (SbMBR) tree, a tree-based indexing and compression method, to address the aforementioned problem. Our method can hierarchically select appropriate minimum bounding rectangles according to the given maximum acceptable errors and use the average value contained in each selected MBR to replace the original data to achieve data compression with multi-layer loss control. This paper also provides the corresponding tree construction algorithm and range query processing algorithm for the indexing structure mentioned above. To evaluate the data quality preservation in cross-domain data analysis and mining scenarios, we use mutual information as the estimation metric. Experimental results emphasize the superiority of our method over some of the typical indexing and compression algorithms.

Keywords