Frontiers in Earth Science (Oct 2021)

Developing an XGBoost Regression Model for Predicting Young’s Modulus of Intact Sedimentary Rocks for the Stability of Surface and Subsurface Structures

  • Niaz Muhammad Shahani,
  • Xigui Zheng,
  • Cancan Liu,
  • Fawad Ul Hassan,
  • Peng Li

DOI
https://doi.org/10.3389/feart.2021.761990
Journal volume & issue
Vol. 9

Abstract

Young’s modulus (E) is essential for predicting the behavior of materials under stress and plays an important role in the stability of surface and subsurface structures. E has a wide range of applications in mining, geology, and civil engineering, including coal and metal mines, tunnels, foundations, slopes, bridges, buildings, and drilling. This study developed a novel machine learning regression model, extreme gradient boosting (XGBoost), to predict two outputs, static Young’s modulus (Es) in GPa and dynamic Young’s modulus (Ed) in GPa, from four inputs: uniaxial compressive strength in MPa, density in g/cm³, p-wave velocity (Vp) in m/s, and s-wave velocity in m/s. Using a series of basic statistical analysis tools, the strength of the relationship between each input and each output was systematically examined to identify the most influential input parameters. Two other models, multiple linear regression (MLR) and artificial neural network (ANN), were then employed to predict Es and Ed and were compared with XGBoost. For each model, the original dataset was split into 70% for the training stage and 30% for the testing stage. To improve the performance of the developed models, an iterative 10-fold cross-validation method was used. The results show that the XGBoost model performed best among the models developed in this study, with a high coefficient of determination (R²) (Es: R² = 0.998; Ed: R² = 0.999 in the training stage; Es: R² = 0.997; Ed: R² = 0.999 in the testing stage), low root mean square error (RMSE) (Es: RMSE = 0.0652; Ed: RMSE = 0.0062 in the training stage; Es: RMSE = 0.071; Ed: RMSE = 0.027 in the testing stage), low RMSE-standard deviation ratio (RSR) (Es: RSR = 0.00238; Ed: RSR = 0.00023 in the training stage; Es: RSR = 0.00304; Ed: RSR = 0.001 in the testing stage), and high variance accounted for (VAF) (Es: VAF = 99.71; Ed: VAF = 99.99 in the training stage; Es: VAF = 99.83; Ed: VAF = 99.94 in the testing stage). Using a novel machine learning approach, this study offers an alternative means of predicting Es and Ed with suitable accuracy and runtime.
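
The abstract names the data split (70% training, 30% testing), 10-fold cross-validation, and four evaluation metrics (R², RMSE, RSR, VAF), but does not give their formulas. The following is a minimal Python sketch of how these steps are commonly computed, not the authors' implementation. The function names are hypothetical, the metric definitions are the widely used textbook forms and are assumed rather than taken from the paper, and the model fitting itself (XGBoost, MLR, ANN) is omitted.

```python
import math
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    # Population variance; the n vs. n-1 choice cancels in the VAF ratio.
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def regression_metrics(observed, predicted):
    """Return R2, RMSE, RSR and VAF (in percent) under assumed definitions."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r ** 2 for r in residuals)
    mean_obs = _mean(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        # Coefficient of determination: 1 - SS_res / SS_tot
        "R2": 1.0 - ss_res / ss_tot,
        # Root mean square error
        "RMSE": math.sqrt(ss_res / len(observed)),
        # RMSE divided by the standard deviation of the observations
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_tot),
        # Variance accounted for, as a percentage
        "VAF": (1.0 - _variance(residuals) / _variance(observed)) * 100.0,
    }
```

Under these definitions, calling `regression_metrics(observed, predicted)` once on the training-stage predictions and once on the testing-stage predictions would produce the two sets of values that the abstract reports for each of Es and Ed.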

Keywords