IEEE Access (Jan 2019)
Improving NoSQL Storage Schema Based on Z-Curve for Spatial Vector Data
Abstract
NoSQL databases can provide massive, highly concurrent, and scalable services for storing different types of data. HBase, a NoSQL database in which columns are grouped into column families, is well suited to storing semi-structured or unstructured spatial vector data. However, because NoSQL databases impose few rules and constraints, designing a storage schema for spatial data on top of them is difficult. In this paper, building on our earlier work, we propose an improved Z-curve storage schema for spatial vector data. In the new schema, the row key of a geometric object is the Z-curve code of the spatial grid cells that the object intersects. Moreover, geometric objects with the same row key are stored in one column family. The proposed method has two features. First, geometric objects that are adjacent in location are also adjacent in physical storage. Second, the storage is redundant, which improves query accuracy. In our experiments, we compare the improved Z-curve storage schema with a Quadtree storage schema, an R-tree storage schema, and the previous Z-curve storage schema. Query response time, memory usage, and query accuracy for point and range spatial queries are used to verify the validity of the proposed method. The experimental results show that the two Z-curve-based storage schemas achieve higher query efficiency than the two tree-based storage schemas (the Quadtree storage schema and the R-tree storage schema). More importantly, the query results of the improved Z-curve schema are completely correct, while those of the previous Z-curve schema are not.
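As an illustration of the row-key idea described above, the following is a minimal Python sketch, not the implementation used in the paper. It assumes a uniform grid with non-negative coordinates, approximates "the grid cells intersected by an object" with the cells covered by the object's bounding box, and zero-pads the key so that byte-wise lexicographic ordering (the order in which HBase sorts row keys) matches numeric Z-order. The names `z_code`, `cells_for_bbox`, `row_keys_for_bbox`, and the parameters `cell_size`, `bits`, and `width` are hypothetical.

```python
from typing import List, Tuple


def z_code(col: int, row: int, bits: int = 16) -> int:
    """Interleave the bits of a grid column and row into a Z-curve (Morton) code.

    Bit i of `col` lands at position 2*i and bit i of `row` at position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def cells_for_bbox(min_x: float, min_y: float, max_x: float, max_y: float,
                   cell_size: float) -> List[Tuple[int, int]]:
    """Return (col, row) for every grid cell touched by a bounding box.

    Assumption: the bounding box stands in for the exact geometry, so this
    may return more cells than a true geometry/cell intersection test.
    """
    c0, c1 = int(min_x // cell_size), int(max_x // cell_size)
    r0, r1 = int(min_y // cell_size), int(max_y // cell_size)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def row_keys_for_bbox(min_x: float, min_y: float, max_x: float, max_y: float,
                      cell_size: float, width: int = 10) -> List[str]:
    """Zero-padded Z-codes, one per touched cell, sorted in Z-order.

    An object spanning several cells yields several keys, which is the
    storage redundancy mentioned in the abstract.
    """
    codes = sorted(z_code(c, r)
                   for c, r in cells_for_bbox(min_x, min_y, max_x, max_y, cell_size))
    return [f"{code:0{width}d}" for code in codes]


if __name__ == "__main__":
    # An object whose bounding box straddles four unit cells gets four keys.
    print(row_keys_for_bbox(0.5, 0.5, 1.5, 1.5, cell_size=1.0))
```

In this sketch an object that overlaps four cells is written under four row keys, so a point or range query that inspects any one of those cells can still find it; the cost is the extra storage.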
Keywords