Gong-kuang zidonghua (Feb 2023)

Research on distributed storage of 3D stack grid model of coal mine geology based on HDF5

  • GUO Jun

DOI
https://doi.org/10.13272/j.issn.1671-251x.18056
Journal volume & issue
Vol. 49, no. 1
pp. 153 – 161

Abstract

Using a true 3D gridded geological model to achieve multi-resolution representation and multi-parameter fusion of the coal mine geological environment is a key topic in coal mine geological big data research. The core issues are the organization, storage and management of 3D geological model data. To address the data scale, distributed storage and query performance of 3D geological grid models in coal mines, a distributed storage scheme for the 3D stack grid model based on HDF5 is proposed. For grid data organization, the 3D geological model data is compressed and organized in blocks using the stack grid model. Data segmentation solves the problem of organizing large-scale geological grid model data. It also concentrates spatially close data in adjacent hard disk sectors or storage devices, which improves the efficiency of data scheduling. For data storage, HDF5 serves as the persistence layer and stores all original data, while the in-memory database Redis stores hot data, HDF5 metadata and other related information. For Web services, H5Serv is used to send and receive HDF5 data. For HDF5 distribution, the network file system (NFS) shares HDF5 data between node servers, and Rsync and Inotify synchronize HDF5 data across node servers in real time. Nginx acts as a reverse proxy and load-balances access across the data service nodes. Docker container technology is used to deploy the data node services and the Nginx service uniformly. The JupyterLab interactive analysis platform is used to schedule and manage real-time data resources.
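The grid organization described above can be illustrated with a minimal Python sketch. It assumes that a stack grid stores each vertical column as a short list of runs rather than as individual voxels, and that columns are grouped into square blocks by their horizontal index. The function names, the `(top_index, code)` encoding, the block size and the lithology codes are all hypothetical; the abstract does not specify the paper's actual data layout.

```python
from collections import defaultdict
from itertools import groupby


def column_to_stack(column):
    """Run-length encode one vertical voxel column, listed bottom to top.

    Returns a list of (top_index, code) pairs, where top_index is the
    exclusive upper voxel index of each run (an assumed encoding).
    """
    stack = []
    top = 0
    for code, run in groupby(column):
        top += len(list(run))
        stack.append((top, code))
    return stack


def partition_into_blocks(stacks, block_size):
    """Group columns keyed by (i, j) into square blocks keyed by (bi, bj)."""
    blocks = defaultdict(dict)
    for (i, j), stack in stacks.items():
        blocks[(i // block_size, j // block_size)][(i, j)] = stack
    return dict(blocks)


# Hypothetical 4 x 4 grid of columns, each 10 voxels tall; codes are made up.
voxels = {
    (i, j): ["sandstone"] * 4
    + ["coal"] * (1 + (i + j) % 2)
    + ["mudstone"] * (5 - (i + j) % 2)
    for i in range(4)
    for j in range(4)
}
stacks = {ij: column_to_stack(col) for ij, col in voxels.items()}
blocks = partition_into_blocks(stacks, block_size=2)
```

In this sketch each block holds the columns of one 2 x 2 horizontal tile, so a block could be written as one unit (for example, one HDF5 dataset) and spatially close columns would be read together.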
The experimental results show that organizing the geological model data with the stack grid and storing it in a distributed manner with HDF5 enables effective storage management and spatial query of 3D geological grid models of coal mines. Compared with the voxel model and the octree model, the stack grid model has a smaller data volume and supports fast spatial queries of geological interfaces. Its spatial query performance is better than that of the relational database MySQL and the non-relational database MongoDB. The stack grid model is better suited to the grid representation and data organization of the sedimentary stratigraphic structure of coal measures. File storage based on HDF5 saves significantly more space than MySQL and MongoDB database storage, mainly because an HDF5 dataset can store data blocks directly without additional storage information. The data organization and storage scheme based on the stack grid model and HDF5 can serve as a reference for the effective storage management of 3D geological grid models in coal mines.
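Why a stack representation can be smaller than a voxel column and can answer interface queries directly is sketched below in Python. The sketch assumes a column is stored as `(top_index, code)` runs ordered bottom to top; this encoding, the function names and the lithology codes are illustrative assumptions, not the paper's implementation.

```python
from bisect import bisect_right


def code_at(stack, k):
    """Return the code of voxel index k (0-based, must lie inside the column)."""
    tops = [top for top, _ in stack]
    return stack[bisect_right(tops, k)][1]


def interface_height(stack, lower, upper):
    """Return the index where `lower` is directly overlain by `upper`, else None."""
    for (top, code), (_, next_code) in zip(stack, stack[1:]):
        if code == lower and next_code == upper:
            return top
    return None


# Hypothetical column: 4 voxels of sandstone, 1 of coal, 5 of mudstone.
stack = [(4, "sandstone"), (5, "coal"), (10, "mudstone")]

voxel_count = stack[-1][0]  # entries a plain voxel column would need
run_count = len(stack)      # entries the stack representation needs
```

Here the interface between coal and mudstone is read straight from the run boundaries without scanning voxels, and the column needs 3 entries instead of 10. The saving depends on how layered the column is, so the ratio in this toy example should not be read as a measured result.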

Keywords