Meitan xuebao (May 2024)

Method for filling missing data of mine ventilation parameters

  • Jingfeng NI,
  • Xuefeng LIU,
  • Lijun DENG

DOI
https://doi.org/10.13225/j.cnki.jccs.2023.0481
Journal volume & issue
Vol. 49, no. 5
pp. 2315 – 2323

Abstract

Intelligent mine ventilation systems are an important part of the intelligent construction of coal mines. Mine ventilation parameter data are often missing because of restrictive factors during actual measurement, such as a lack of measurement conditions, instrument signal interference, uneven wind speed across the roadway section, and improper manual operation. To solve this problem, a method for filling missing mine ventilation parameter data was proposed, based on multiple imputation by chained equations with random forests. Multiple imputation with chained equations iteratively generates n filled values for each missing attribute value, yielding n complete datasets, and a final complete dataset is obtained by analyzing and optimizing these n datasets. To improve the filling accuracy of missing values, the influence of the uncertainty of the missing ventilation parameter data on the analysis process is taken into account, and the missing data are filled through the random forest prediction task in combination with a predictive mean matching model. Taking the Luxin No.2 Mine as an experimental example, the intelligent mine ventilation simulation system IMVS was used to preprocess the mine's original ventilation parameter dataset to obtain a complete and accurate dataset of mine ventilation parameters. Comparative experiments with different missing attributes, different data missing rates, and different numbers of iterations were conducted separately on this complete dataset. The effectiveness of the model was evaluated with a variety of model evaluation indicators. The results show that the complete dataset formed by the random forest-chained equation multiple imputation method has good similarity with the original dataset.
Filling experiments with different missing columns show that the filling model readily handles mixed data types and learns the correlations between parameters autonomously, which reduces filling complexity. The n datasets formed after iteration are combined by analysis into a final dataset, which improves the filling accuracy. Experiments with different numbers of iterations on the complete dataset after initial filling show that the data correlation converges after a certain number of iterations.
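
The chained-equation procedure with predictive mean matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes purely numeric columns and at least two columns, uses a small bagged ensemble of one-split regression trees as a simplified stand-in for a full random forest, and pools the n completed datasets by simple averaging, since the abstract does not specify how the datasets are analyzed and combined. All function names and parameters (fit_stump, fit_forest, pmm_draw, impute_once, multiple_impute, n_trees, k, n_iter, n_datasets) are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y, rng):
    """Fit a one-split regression tree on one randomly chosen feature."""
    j = rng.randrange(len(X[0]))
    best = None  # (sse, threshold, left_mean, right_mean)
    for t in sorted({row[j] for row in X})[:-1]:
        left = [yi for row, yi in zip(X, y) if row[j] <= t]
        right = [yi for row, yi in zip(X, y) if row[j] > t]
        ml, mr = mean(left), mean(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # the feature is constant in this bootstrap sample
        m = mean(y)
        return lambda row: m
    _, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr


def fit_forest(X, y, rng, n_trees=25):
    """Bagged stumps: a simplified stand-in for a random forest regressor."""
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return lambda row: mean(s(row) for s in stumps)


def pmm_draw(pred_missing, donors, rng, k=3):
    """Predictive mean matching: pick one of the k donors whose predicted
    value is closest to the prediction for the missing cell, and return
    that donor's observed value. donors is a list of (predicted, observed)."""
    nearest = sorted(donors, key=lambda d: abs(d[0] - pred_missing))[:k]
    return rng.choice(nearest)[1]


def impute_once(rows, rng, n_iter=5):
    """One chained-equation run; missing cells are marked with None."""
    n_cols = len(rows[0])
    missing = [[v is None for v in r] for r in rows]
    col_means = [mean(r[c] for r in rows if r[c] is not None) for c in range(n_cols)]
    # Initial fill with column means, refined column by column below.
    data = [[col_means[c] if r[c] is None else r[c] for c in range(n_cols)] for r in rows]

    def features(i, c):
        return data[i][:c] + data[i][c + 1:]

    for _ in range(n_iter):
        for c in range(n_cols):
            miss_idx = [i for i, m in enumerate(missing) if m[c]]
            if not miss_idx:
                continue
            obs_idx = [i for i, m in enumerate(missing) if not m[c]]
            model = fit_forest([features(i, c) for i in obs_idx],
                               [data[i][c] for i in obs_idx], rng)
            donors = [(model(features(i, c)), data[i][c]) for i in obs_idx]
            for i in miss_idx:
                data[i][c] = pmm_draw(model(features(i, c)), donors, rng)
    return data


def multiple_impute(rows, n_datasets=5, seed=0):
    """Generate n completed datasets and pool them by averaging (an assumption)."""
    completed = [impute_once(rows, random.Random(seed + m)) for m in range(n_datasets)]
    pooled = [[mean(d[i][c] for d in completed) for c in range(len(rows[0]))]
              for i in range(len(rows))]
    return completed, pooled
```

Because predictive mean matching only ever copies values from observed donors, every imputed value in a completed dataset is a value that was actually measured in that column. That property keeps fills physically plausible, which is one common reason for pairing it with a tree-based predictor.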

Keywords