IEEE Access (Jan 2023)

Normalized Storage Model Construction and Query Optimization of Book Multi-Source Heterogeneous Massive Data

  • Dailin Wang,
  • Lina Liu,
  • Yali Liu

DOI
https://doi.org/10.1109/ACCESS.2023.3301134
Journal volume & issue
Vol. 11
pp. 96543 – 96553

Abstract

Book literature data in the context of the metaverse is massive, multi-source, heterogeneous, and fast-growing. Managing and retrieving unstructured, heterogeneous book information efficiently requires standardized storage, effective extraction, and scientifically grounded library construction. This study therefore focuses on three problems: the normalization of multi-source heterogeneous massive book data, the construction of a warehouse model for book data in the metaverse context, and the querying and query optimization of book data. It investigates and implements a systematic approach to processing, managing, and querying multi-source heterogeneous massive book data in the metaverse, with the aim of improving the utilization value and query efficiency of the data. The study uses the semi-structured features of book text data to construct an extraction rule model for heterogeneous book data, and applies it to extract massive heterogeneous book information. Building on the HBase distributed storage structure and parallel computing technology, it optimizes the storage scheme and improves query efficiency to support efficient management and retrieval of massive heterogeneous book data. The experimental results show significant improvements over traditional methods in several respects, including the accuracy and recall of book text data extraction, the management of book information, and query efficiency.
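
To make the idea of rule-based extraction from semi-structured book text more concrete, the following is a minimal Python sketch. It is not the extraction rule model described in the paper: the field labels (`Title:`, `Author:`, `ISBN:`, `Year:`), the `FIELD_RULES` table, and the `extract_record` and `make_row_key` helpers are all hypothetical, chosen only to illustrate how labeled lines might be mapped to normalized fields and how a row key for a key-value store such as HBase might be derived from them.

```python
import re

# Hypothetical extraction rules: one regular expression per target field.
# The labels are illustrative assumptions, not the rules used in the paper.
FIELD_RULES = {
    "title": re.compile(r"^Title:[ \t]*(.+)$", re.MULTILINE),
    "author": re.compile(r"^Author:[ \t]*(.+)$", re.MULTILINE),
    "isbn": re.compile(r"^ISBN:[ \t]*([\d-]+)$", re.MULTILINE),
    "year": re.compile(r"^Year:[ \t]*(\d{4})$", re.MULTILINE),
}


def extract_record(text):
    """Apply every rule to the text and collect the fields that match."""
    record = {}
    for field, pattern in FIELD_RULES.items():
        match = pattern.search(text)
        if match:
            record[field] = match.group(1).strip()
    return record


def make_row_key(record):
    """Derive a row key from the ISBN digits (an assumed key design).

    Returns None when the record has no ISBN, so the caller can decide
    how to handle records that cannot be keyed this way.
    """
    isbn = record.get("isbn", "").replace("-", "")
    return isbn or None
```

Under these assumptions, a source that labels its fields differently would need only its own entry in the rule table, while the normalized record and row key keep the same shape for every source.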

Keywords