暴雨灾害 (Oct 2022)

Design and implementation of the meteorological big data computing resource management system based on container platform

  • Daicai YANG,
  • Hua WANG,
  • Junchao WANG,
  • Mu QIAO,
  • Juanjuan WANG,
  • Anwei LAI

DOI
https://doi.org/10.12406/byzh.2022-052
Journal volume & issue
Vol. 41, no. 5
pp. 607 – 612

Abstract

With the Hubei meteorological big data cloud platform "Tianqing" now in formal operational use, improving the efficiency of big data algorithm processing and strengthening the allocation of computing resources has become an urgent problem. This paper discusses the design and some key technologies of the meteorological big data computing resource optimization management system in terms of its construction requirements, overall architecture, and implementation, and draws the following conclusions: based on the B/S (browser/server) architecture, the system takes "Tianqing" as the data center and the container as the object of computing resource supply and optimization management, and comprises two subsystems, processing container scheduling and container platform management; the processing container scheduling subsystem provides account creation, resource authorization, image building, algorithm registration, algorithm loading, task definition, task startup, task log collection, and other functions, so users can flexibly trigger tasks on the scheduling platform; after a task is triggered, the processing container scheduling subsystem creates a container on the container platform, executes the task, and monitors the container's usage in real time during task execution; the container platform management subsystem, built on the Kubernetes container orchestration engine, compares the resources configured for the algorithm with the allocatable resources on the schedulable nodes, completes the assembly of configuration files, the issuance of resource deployment scripts, and the pre-selection and optimization of nodes, and finally obtains the results of running the algorithm; key technologies such as multi-node balanced scheduling, fine-grained matching of algorithm resources, container runtime resource isolation, image storage fault recovery, and container and algorithm fault monitoring are used together to effectively improve the scheduling ability, reliability, and resource utilization of container computing resources.
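
The node pre-selection and optimization step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Resources`, `Node`, `preselect`, `score`, and `pick_node`, the CPU and memory units, and the scoring rule are all assumptions made for illustration. The sketch first filters schedulable nodes whose allocatable resources cover what the algorithm requests, then prefers the node that would remain least loaded, which is one simple way to approximate balanced scheduling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resources:
    # Hypothetical units: CPU in millicores, memory in MiB.
    cpu_millicores: int
    memory_mib: int


@dataclass(frozen=True)
class Node:
    name: str
    allocatable: Resources  # resources still free for new containers
    schedulable: bool = True


def preselect(nodes: List[Node], request: Resources) -> List[Node]:
    """Pre-selection: keep schedulable nodes whose free resources cover the request."""
    return [
        n for n in nodes
        if n.schedulable
        and n.allocatable.cpu_millicores >= request.cpu_millicores
        and n.allocatable.memory_mib >= request.memory_mib
    ]


def _fraction_left(free: int, requested: int) -> float:
    # Share of the node's free capacity that remains after placement.
    return (free - requested) / free if free > 0 else 0.0


def score(node: Node, request: Resources) -> float:
    """Assumed balance score: the tighter of the CPU and memory margins (higher is better)."""
    return min(
        _fraction_left(node.allocatable.cpu_millicores, request.cpu_millicores),
        _fraction_left(node.allocatable.memory_mib, request.memory_mib),
    )


def pick_node(nodes: List[Node], request: Resources) -> Optional[Node]:
    """Optimization: among pre-selected nodes, pick the one with the best score."""
    candidates = preselect(nodes, request)
    if not candidates:
        return None  # no node can host the algorithm's container
    return max(candidates, key=lambda n: score(n, request))


nodes = [
    Node("node-a", Resources(4000, 8192)),
    Node("node-b", Resources(2000, 4096)),
    Node("node-c", Resources(8000, 16384), schedulable=False),
]
chosen = pick_node(nodes, Resources(1000, 2048))
```

Kubernetes itself uses a comparable two-phase approach (filtering, then scoring), so a sketch of this shape is consistent with the abstract, although the actual scoring criteria used by the system are not stated there.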

Keywords