Advances in Mechanical Engineering (Nov 2016)

Research and realization of improved extract–transform–load scheduler in China Southern Power Grid

  • Li Guo,
  • Huang Wenqi,
  • Yuan Xiaokai,
  • Zhang Fuzheng,
  • Chen Chengzhi,
  • Chen Shitao

DOI
https://doi.org/10.1177/1687814016679055
Journal volume & issue
Vol. 8

Abstract

Applying big data techniques to power systems will contribute to the sustainable development and robust establishment of China Southern Power Grid; it is therefore necessary to construct a new framework for the China Southern Power Grid big data platform. Alongside key technologies such as data analysis, data processing, and data visualization, data integration and fusion in the data warehouse play an important role in high-quality data analysis and mining. To minimize operation time and memory consumption, various scheduling strategies for extract–transform–load workflows are proposed, including the round-robin algorithm, the minimum-cost algorithm, the minimum-memory algorithm, and a mixture of the minimum-cost and minimum-memory algorithms. In combination with these algorithms, a workflow is divided into many subflows by algorithms such as shortest-subflow-first and priority-backfilling, which further improves parallel computation. Three algorithms for scheduling subflows are then established: minimum-cost and minimum-memory with shortest-subflow-first, minimum-cost and minimum-memory with priority-backfilling, and minimum-cost and minimum-memory with shortest-subflow-first and priority-backfilling. Finally, with the characteristics of China Southern Power Grid big data in mind, several performance indexes are used to evaluate these algorithms. The experimental results show that the minimum-cost and minimum-memory with shortest-subflow-first and priority-backfilling algorithm is superior to the hybrid prioritization algorithm based on the rank level of each task (hybrid), online workflow management, the minimum-cost and minimum-memory with shortest-subflow-first algorithm, and the minimum-cost and minimum-memory with priority-backfilling algorithm, and that system robustness is also significantly improved.
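
The abstract names a shortest-subflow-first strategy but does not define it. As a rough illustration only, the sketch below shows a generic shortest-first list scheduler in Python: subflows are sorted by an estimated cost and each is handed to whichever worker becomes free earliest. The function name, the subflow names, the cost values, and the worker count are all hypothetical, and the sketch ignores memory consumption, task dependencies, and backfilling, so it should not be read as the authors' algorithm.

```python
import heapq


def shortest_subflow_first(subflows, n_workers):
    """Assign subflows to workers, shortest estimated cost first.

    subflows: dict mapping a (hypothetical) subflow name to an estimated cost.
    Returns (schedule, makespan); schedule maps name -> (worker, start, end).
    """
    # Min-heap of (time at which the worker becomes free, worker id).
    workers = [(0, w) for w in range(n_workers)]
    heapq.heapify(workers)

    schedule = {}
    # Shortest first: order by estimated cost, breaking ties by name.
    for name, cost in sorted(subflows.items(), key=lambda kv: (kv[1], kv[0])):
        free_at, worker = heapq.heappop(workers)
        schedule[name] = (worker, free_at, free_at + cost)
        heapq.heappush(workers, (free_at + cost, worker))

    makespan = max((end for _, _, end in schedule.values()), default=0)
    return schedule, makespan


# Illustrative input only; these costs are invented for the example.
subflows = {"extract_a": 4, "extract_b": 2, "transform": 6, "load": 1}
schedule, makespan = shortest_subflow_first(subflows, n_workers=2)
for name, (worker, start, end) in sorted(schedule.items(), key=lambda kv: kv[1][1]):
    print(f"{name}: worker {worker}, {start} to {end}")
print("makespan:", makespan)
```

Running short subflows first tends to lower average waiting time, although it does not by itself minimize total completion time; that trade-off is presumably one reason the abstract pairs it with cost, memory, and backfilling criteria.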