Xibei Gongye Daxue Xuebao (Jun 2024)
Operator parallel optimization strategy for distributed databases
Abstract
With the continuous development of network technology, data volumes have grown explosively, and distributed databases are gradually replacing traditional single-machine databases. Distributed databases solve the problem of large-scale data storage through cooperation among nodes, but because of the added communication cost between nodes, their query efficiency is lower than that of a standalone database. In a distributed architecture, the replicas held on storage nodes serve only as redundant backups that allow data recovery after a system failure; they are not used to improve query efficiency. To address these issues, this paper proposes an operator parallel optimization strategy for distributed databases. Key physical operators are split, and the resulting sub-requests are distributed evenly across multiple nodes in the storage layer and processed in parallel, thereby reducing query response time. The strategy has been applied to the distributed database CBase, and experiments show that it can significantly shorten the query time of SQL requests and improve system resource utilization.
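As a rough illustration of the splitting idea described in the abstract, the following Python sketch divides a key-range scan into near-equal sub-requests and sends each one to a different replica in parallel. It is a minimal sketch under stated assumptions, not the CBase implementation: the function names (`split_range`, `scan_replica`, `parallel_scan`) are hypothetical, each replica is modeled as an in-memory dictionary holding a full copy of the data, and a thread pool stands in for network requests to storage nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, parts):
    """Split the half-open key range [start, end) into at most `parts`
    contiguous sub-ranges whose sizes differ by at most one key."""
    total = end - start
    if total <= 0:
        return []
    parts = max(1, min(parts, total))
    base, extra = divmod(total, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` sub-ranges take one additional key each.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def scan_replica(replica, lo, hi):
    """Hypothetical storage-node scan: return (key, value) rows in [lo, hi)."""
    return [(k, replica[k]) for k in range(lo, hi) if k in replica]


def parallel_scan(replicas, start, end):
    """Assumed coordinator logic: one sub-request per replica, run in
    parallel, with results concatenated in key order."""
    sub_ranges = split_range(start, end, len(replicas))
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [
            pool.submit(scan_replica, replicas[i], lo, hi)
            for i, (lo, hi) in enumerate(sub_ranges)
        ]
        rows = []
        # Collect in submission order so the merged output stays sorted by key.
        for future in futures:
            rows.extend(future.result())
    return rows


if __name__ == "__main__":
    table = {k: f"row-{k}" for k in range(10)}
    replicas = [dict(table) for _ in range(3)]  # three full copies of the data
    print(split_range(0, 10, 3))
    print(parallel_scan(replicas, 0, 10))
```

In this sketch the replicas that would otherwise sit idle as backups each serve one slice of the scan, which is the source of the reduced response time the abstract refers to. A real system would also need to handle replica lag, node failure, and the merge cost for operators other than simple scans.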
Keywords