Applied Computing and Geosciences (Sep 2024)

Machine Learning model interpretability using SHAP values: Application to Igneous Rock Classification task

  • Antonella S. Antonini,
  • Juan Tanzola,
  • Lucía Asiain,
  • Gabriela R. Ferracutti,
  • Silvia M. Castro,
  • Ernesto A. Bjerg,
  • María Luján Ganuza

Journal volume & issue
Vol. 23
p. 100178

Abstract

The El Fierro intrusive body is one of the bodies that make up the La Jovita–Las Aguilas mafic–ultramafic belt, located in the Sierra Grande de San Luis, Argentina. The units of this belt host base metal sulfide (BMS) mineralization and platinum group minerals (PGM). Macroscopic description of mafic and ultramafic rocks, as usually performed by mining exploration companies, leads to an imprecise modal classification of the rocks. In this study, we develop a random forest-based prediction model that uses geochemical parameters to classify mafic and ultramafic rocks intercepted by drill cores. The model achieved an accuracy between 86% and 94% and an F1 score of 96%. Random forest classification is a widely adopted machine learning approach for building predictive models across research domains. However, as models become more complex, interpreting them becomes considerably more difficult. To interpret the model results, we adopt both global and local perspectives using the SHAP (SHapley Additive exPlanations) method. SHAP allows us to analyze individual samples through force plots and provides a measure of the importance of each geochemical input attribute to the model output. Analyzing the contribution of each input feature showed that the three variables with the highest contributions were, in order, Al2O3, MgO, and Sr.
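The idea behind the per-feature contributions described above can be illustrated with a small, self-contained sketch of exact Shapley values. This is not the authors' model or the SHAP library: `toy_model`, its coefficients, and the `background` and `sample` values are hypothetical stand-ins, and a single background point is used where SHAP would normally average over a background dataset. Only the feature names Al2O3, MgO, and Sr come from the abstract.

```python
from itertools import combinations
from math import factorial

FEATURES = ["Al2O3", "MgO", "Sr"]


def toy_model(x):
    """Hypothetical score function standing in for a trained classifier."""
    return (0.5 * x["Al2O3"] - 0.3 * x["MgO"] + 0.01 * x["Sr"]
            + 0.02 * x["Al2O3"] * x["MgO"])


def value(subset, sample, background):
    """Model output when only the features in `subset` take the sample's
    values; all other features are held at the background values."""
    mixed = {f: (sample[f] if f in subset else background[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(sample, background):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over every subset of the remaining features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}, sample, background)
                                   - value(s, sample, background))
        phi[f] = total
    return phi


# Hypothetical geochemical values, chosen only for illustration.
background = {"Al2O3": 10.0, "MgO": 20.0, "Sr": 100.0}
sample = {"Al2O3": 15.0, "MgO": 12.0, "Sr": 250.0}

phi = shapley_values(sample, background)

# Global-style ranking for this one sample: largest absolute contribution first.
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)

for f in ranking:
    print(f, phi[f])
```

The contributions sum to the difference between the model output for the sample and for the background point, which is the additive property that force plots visualize. Exact enumeration is only practical for a handful of features; for tree ensembles such as random forests, the SHAP library provides a dedicated tree-based explainer instead.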

Keywords