Gong-kuang zidonghua (May 2023)

Massive data mining and analysis platform design for fully mechanized working face

  • WANG Hongwei,
  • YANG Kun,
  • FU Xiang,
  • LI Jin,
  • JIA Sifeng

DOI
https://doi.org/10.13272/j.issn.1671-251x.18088
Journal volume & issue
Vol. 49, no. 5
pp. 30 – 36, 126

Abstract

Massive data acquisition in fully mechanized working faces currently suffers from poor real-time performance and poor data integrity. Cleaning abnormal data takes a long time, and data mining is subject to long delays. As a result, the utilization rate of working face data is low, and the data cannot help management issue decision-making instructions in real time. To solve these problems, a massive data mining and analysis platform for fully mechanized working faces is designed. The platform consists of a data source layer, a data acquisition and storage layer, a data mining layer, and a front-end application layer. In the data source layer, the various hardware devices on the working face provide the raw data. The data acquisition and storage layer uses an OPC UA gateway to collect real-time monitoring information from underground sensors, and then stores the data in the InfluxDB storage engine through the MQTT protocol and a RESTful interface. The data mining layer uses the Hive data engine and the Yarn resource manager to filter out abnormal data caused by on-site interference during data acquisition, and to correct the locally out-of-order acquisition sequence caused by network latency. The Spark distributed mining engine is used to explore the potential value of the massive working condition data from the working face equipment group, which speeds up the data mining models. The front-end application layer binds visual components to the back-end database and interacts with the back-end data in real time through AJAX to display model mining results and various monitoring data. The test results show that the platform ensures the real-time performance and integrity of data acquisition. Its cleaning efficiency is 5 times that of a standalone MySQL query engine, and its mining efficiency is 4 times that of a standalone Python mining engine.
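
The two cleaning tasks named in the abstract, filtering abnormal values and restoring the acquisition order disturbed by network latency, can be illustrated with a minimal sketch. The abstract does not give the platform's actual cleaning rules, and the platform itself runs on Hive and Yarn, so this is plain single-machine Python under stated assumptions: the `Reading` record, the `clean_readings` function, the sensor name, and the valid range of 0 to 60 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    """One sensor sample (hypothetical record layout, not from the paper)."""
    sensor_id: str    # assumed sensor name
    timestamp: float  # acquisition time at the sensor, in seconds
    value: float      # measured quantity, e.g. a pressure reading


def clean_readings(readings: Iterable[Reading], low: float, high: float) -> List[Reading]:
    """Drop out-of-range samples, then order the rest by sensor and time."""
    # Assumption: an abnormal sample is one outside a fixed valid range.
    in_range = [r for r in readings if low <= r.value <= high]
    # Samples may arrive late over the network, so sort by the
    # acquisition timestamp rather than trusting arrival order.
    return sorted(in_range, key=lambda r: (r.sensor_id, r.timestamp))


# Usage: three samples arriving out of order, one of them a spike.
raw = [
    Reading("support_01", 3.0, 31.2),
    Reading("support_01", 1.0, 30.8),
    Reading("support_01", 2.0, -999.0),  # assumed interference spike
]
cleaned = clean_readings(raw, low=0.0, high=60.0)
print([r.timestamp for r in cleaned])  # [1.0, 3.0]
```

In a distributed setting the same two steps would correspond to a filter followed by an ordering on the acquisition timestamp; the range check here stands in for whatever abnormality criterion the platform actually applies.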

Keywords