Unconventional Resources (Jan 2022)
Shale lithology identification using stacking model combined with SMOTE from well logs
Abstract
Shale lithology identification is the basis of geological research and reservoir characterization, and is an essential task in oil exploration. Recently, several machine learning algorithms have been applied to improve the accuracy of lithology identification. However, stacking models have rarely been used for lithology identification in existing studies, and the problem of imbalanced lithology classes has received little attention. In this study, we build a stacking model based on random forest (RF), extreme gradient boosting (XGBoost) and linear regression (LR), and use the synthetic minority oversampling technique (SMOTE) to mitigate the class imbalance. We then compare the stacking model with support vector machine (SVM), RF and XGBoost models, after tuning the parameters of each model using grid search and fivefold cross-validation. We prepared a dataset consisting of logging data and core data from 13 wells in a depression in the Junggar Basin, China, comprising a total of 2352 sample points with lithologic labels. The lithologies identified in this study are mudstone (MS), dolomitic mudstone (DM), siltstone (S), dolomitic siltstone (DS) and micritic dolomite (MD). The results show that (1) the overall identification performance of the stacking model is better than that of the SVM, RF and XGBoost models; (2) the SMOTE algorithm effectively improves the identification performance for the minority lithologies; and (3) the density log is the most important factor in identifying lithologies. The stacking model combined with SMOTE proposed in this paper achieves high identification performance, which makes it practicable for lithology identification from well logs.
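To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. It is not the implementation used in the study. The function names `smote` and `balance`, the two-feature toy samples, and the choice to oversample every class up to the size of the largest class are assumptions made purely for illustration.

```python
import random
from collections import Counter


def smote(samples, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative helper (hypothetical name); needs at least two samples.
    """
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class (assumed strategy)."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count < target:
            minority = [x for x, lab in zip(X, y) if lab == label]
            new_points = smote(minority, target - count, k=k, seed=seed)
            X_out.extend(new_points)
            y_out.extend([label] * len(new_points))
    return X_out, y_out


# Invented toy values (two features per sample), not data from the study.
X = [[2.50, 80.0], [2.52, 85.0], [2.48, 78.0], [2.55, 90.0],
     [2.51, 82.0], [2.49, 79.0], [2.70, 40.0], [2.72, 42.0]]
y = ["MS"] * 6 + ["MD"] * 2

X_bal, y_bal = balance(X, y)
print(Counter(y_bal))
```

In a cross-validated workflow, oversampling of this kind is normally applied to the training folds only, so that synthetic points do not leak into the data used for evaluation.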