Meikuang Anquan (Feb 2023)
Collapsed column identification model based on K-means SMOTE and random forest algorithm
Abstract
Identifying collapsed columns from a single seismic attribute suffers from non-unique solutions and uncertainty, and unbalanced sample data biases identification accuracy. To address both problems, a binary classification model for collapsed column identification was constructed based on K-means SMOTE and random forest. The model identifies collapsed columns through joint analysis of multiple seismic attributes. The southern mining area of the east wing of the first mining area of Shanxi Xinyuan Coal Company was taken as the research area. Twelve seismic attributes extracted by previous interpreters using 3D seismic exploration technology were used as sample features, and the collapsed column information actually revealed during mining was used as sample labels to build a multi-attribute seismic dataset. Seismic attributes were screened through correlation analysis, cluster analysis evaluation and random forest importance analysis, and 6 relatively independent seismic attributes were finally selected as sample features. The K-means SMOTE algorithm was used to balance the dataset, yielding 8 992 samples, of which 6 294 were used as the training set and 2 698 as the test set (roughly a 70/30 split). The random forest binary classification model was built in Python, and its final accuracy in predicting collapsed columns reaches 87%. Compared with three common machine learning classification algorithms, the model identified collapsed columns with higher accuracy.
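The balancing step described above can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, standard-library-only illustration of the K-means SMOTE idea (cluster the samples, keep clusters dominated by the minority class, and create synthetic minority samples by interpolating inside those clusters). The function names `kmeans` and `kmeans_smote`, the toy 2-D data, and the simplifications (deterministic farthest-first initialisation, uniform choice among clusters, interpolation between any two minority points of a cluster rather than between nearest neighbours) are all assumptions made for illustration.

```python
import random
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def kmeans_smote(X, y, minority=1, k=2, seed=0):
    """Simplified K-means SMOTE: oversample `minority` until classes are equal."""
    rng = random.Random(seed)
    assign = kmeans(X, k)
    # Keep clusters that are minority-dominated and hold >= 2 minority points.
    pools = []
    for c in range(k):
        labels = [lab for lab, a in zip(y, assign) if a == c]
        mins = [p for p, lab, a in zip(X, y, assign) if a == c and lab == minority]
        if len(mins) >= 2 and len(mins) > len(labels) / 2:
            pools.append(mins)
    if not pools:
        raise ValueError("no minority-dominated cluster found")
    n_min = sum(1 for lab in y if lab == minority)
    deficit = (len(y) - n_min) - n_min  # samples needed to reach balance
    X_new, y_new = list(X), list(y)
    for _ in range(max(deficit, 0)):
        a, b = rng.sample(rng.choice(pools), 2)  # two distinct minority points
        t = rng.random()
        # Synthetic sample on the segment between a and b.
        X_new.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
        y_new.append(minority)
    return X_new, y_new


# Toy data (hypothetical): 12 majority points near the origin, 4 minority points far away.
majority = [(0.5 * i, 0.5 * j) for i in range(4) for j in range(3)]
minority = [(10.0, 10.0), (10.5, 10.0), (10.0, 10.5), (10.5, 10.5)]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

X_bal, y_bal = kmeans_smote(X, y, minority=1, k=2)
print(len(X), "->", len(X_bal), "samples;", y_bal.count(1), "minority")
```

The balanced set would then be split into training and test subsets and passed to a random forest classifier. The abstract does not say which libraries were used; ready-made implementations exist, for example `KMeansSMOTE` in imbalanced-learn and `RandomForestClassifier` in scikit-learn.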
Keywords