Research on Efficient Data Warehouse Construction Methods for Big Data Applications

Zhao Chenggang; Du Junwei; Wang Furong; Li Haojie

doi:10.2478/amns-2024-3275

Applied Mathematics and Nonlinear Sciences (Jan 2024)

Affiliations

Zhao Chenggang: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, 266061, China.
Du Junwei: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, 266061, China.
Wang Furong: Gaomi Campus, Qingdao University of Science and Technology, Weifang, Shandong, 261500, China.
Li Haojie: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, 266061, China.

DOI: https://doi.org/10.2478/amns-2024-3275
Journal volume & issue: Vol. 9, no. 1

Abstract

In computing scenarios with large data volumes, time-efficient data warehouses are the primary choice for most businesses. In this paper, the metadata module of an efficient data warehouse is designed with MySQL as an intermediate node for exchanging information among modules. First-layer and second-layer data scheduling algorithms monitor the progress of queries and updates in the data warehouse system in real time and set the dynamic priorities of data processing tasks intelligently. The data scheduling and execution module is then built on the scheduling algorithm, and the efficient data warehouse system is constructed using the Hadoop open-source computing framework. The results show that each module of the system passes its functionality test and that data processing times on real and synthetic datasets satisfy the practical time requirements of big data processing and data analysis. In addition, the proposed system outperforms the comparison data warehouse system: on the SD2 dataset with one data dimension, its query time is reduced by 87.74% relative to the comparison system. The system achieves high throughput and low latency, improves the efficiency of data processing, and provides a reference for related research in the field of big data processing.
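
The abstract does not specify the scheduling rules, so the following is only a minimal sketch of how a two-layer scheduler with dynamic priorities might be organized. Everything in it is an assumption made for illustration: the `Task` fields, the first-layer rule (serve the task class, query or update, whose oldest task has waited longest), and the second-layer rule (least slack first, recomputed at every decision). It is not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical task record; these fields are not taken from the paper.
    name: str
    kind: str        # "query" or "update"
    arrival: float   # time the task enters the system
    deadline: float  # latest acceptable completion time
    cost: float      # estimated run time


def choose_queue(queries, updates):
    """Layer 1 (assumed rule): pick the class whose oldest task arrived first."""
    if not queries:
        return updates
    if not updates:
        return queries
    oldest_query = min(t.arrival for t in queries)
    oldest_update = min(t.arrival for t in updates)
    return queries if oldest_query <= oldest_update else updates


def dynamic_priority(task, now):
    """Layer 2 (assumed rule): slack time; a smaller value is more urgent."""
    return task.deadline - now - task.cost


def schedule(tasks):
    """Simulate one worker and return task names in execution order."""
    pending = sorted(tasks, key=lambda t: t.arrival)
    queries, updates, order = [], [], []
    now, i = 0.0, 0
    while i < len(pending) or queries or updates:
        # Admit every task that has arrived by the current time.
        while i < len(pending) and pending[i].arrival <= now:
            target = queries if pending[i].kind == "query" else updates
            target.append(pending[i])
            i += 1
        if not queries and not updates:
            # Nothing is runnable yet: jump to the next arrival.
            now = pending[i].arrival
            continue
        queue = choose_queue(queries, updates)
        # Priorities depend on `now`, so they are recomputed at each decision.
        task = min(queue, key=lambda t: dynamic_priority(t, now))
        queue.remove(task)
        now += task.cost
        order.append(task.name)
    return order


example = [
    Task("q1", "query", 0.0, 10.0, 2.0),
    Task("q2", "query", 0.0, 3.0, 1.0),
    Task("u1", "update", 0.5, 6.0, 1.0),
]
print(schedule(example))  # ['q2', 'q1', 'u1']
```

In this example q2 runs first because it has the least slack, and q1 precedes u1 because the query class holds the longest-waiting task. Recomputing priorities at each decision is what makes them dynamic; a production system would draw arrival, deadline, and cost estimates from monitored query and update progress rather than from fixed values.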

Published in Applied Mathematics and Nonlinear Sciences

ISSN: 2444-8656 (Online)
Publisher: Sciendo
Country of publisher: Poland
LCC subjects: Science: Mathematics
Website: https://sciendo.com/journal/AMNS
