Artificial Intelligence in Geosciences (Dec 2023)

Deriving big geochemical data from high-resolution remote sensing data via machine learning: Application to a tailing storage facility in the Witwatersrand goldfields

  • Steven E. Zhang,
  • Glen T. Nwaila,
  • Julie E. Bourdeau,
  • Yousef Ghorbani,
  • Emmanuel John M. Carranza

Journal volume & issue
Vol. 4
pp. 9 – 21

Abstract

Remote sensing data is a cheap form of surficial geoscientific data and, in terms of veracity, velocity and volume, can sometimes be considered big data. Its spatial and spectral resolution continues to improve, and some modern satellites, such as the Copernicus Programme's Sentinel-2 remote sensing satellites, offer a spatial resolution of 10 m across many of their spectral bands. The abundance and quality of remote sensing data, combined with accumulated primary geochemical data, provides an unprecedented opportunity to inferentially invert remote sensing data into geochemical data. The ability to derive geochemical data from remote sensing data would provide a form of secondary big geochemical data, which can be used for numerous downstream activities, particularly where data timeliness, volume and velocity are important. Major beneficiaries of secondary geochemical data would be environmental monitoring and applications of artificial intelligence and machine learning in geochemistry, which currently rely entirely on manually derived data that is primarily guided by scientific reduction. Furthermore, secondary geochemical data permits well-established data analysis techniques from geochemistry to be applied to remote sensing, allowing usable insights to be extracted beyond those typically associated with strictly remote sensing data analysis. Currently, no generally applicable and systematic method for deriving chemical elemental concentrations from large-scale remote sensing data has been documented in the geosciences. In this paper, we demonstrate that fusing geostatistically augmented geochemical data with remote sensing data produces an abundance of data that enables more generalized machine learning-based geochemical data generation. We use gold grade data from a South African tailing storage facility (TSF) and data from both the Landsat-8 and Sentinel remote sensing satellites. We show that various machine learning algorithms can be used given the abundance of training data. Consequently, we are able to produce a high-resolution (10 m grid size) gold concentration map of the TSF, which demonstrates the potential of our method to guide extraction planning, online resource exploration, environmental monitoring and resource estimation.
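
The general idea of learning a mapping from per-pixel spectral band values to an element concentration, then applying it to every pixel of a grid, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data is synthetic, the four "bands" and the relationship between bands and grade are invented for illustration, and the names `synthetic_pixel` and `knn_predict` are hypothetical. The geostatistical augmentation step is skipped, so grade labels are simply assumed to be already co-located with pixels. A distance-weighted k-nearest-neighbour regressor stands in for whichever machine learning algorithm might be used in practice.

```python
import math
import random

random.seed(0)

def synthetic_pixel():
    """Return (band values, gold grade) for one hypothetical pixel."""
    bands = [random.random() for _ in range(4)]  # assumed: 4 reflectance bands
    # Invented relationship between bands and grade (g/t), plus small noise.
    grade = 0.2 + 0.5 * bands[0] - 0.3 * bands[2] + random.gauss(0, 0.01)
    return bands, grade

def knn_predict(train_X, train_y, query, k=5):
    """Inverse-distance-weighted k-nearest-neighbour regression."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

# Training set: pixels for which a grade label is assumed to be available.
train = [synthetic_pixel() for _ in range(500)]
train_X = [bands for bands, _ in train]
train_y = [grade for _, grade in train]

# Unlabelled 10 x 10 grid of pixels; predict a grade for every cell.
grid = [[synthetic_pixel()[0] for _ in range(10)] for _ in range(10)]
grade_map = [[knn_predict(train_X, train_y, px) for px in row] for row in grid]

for row in grade_map[:3]:
    print(" ".join(f"{g:.2f}" for g in row))
```

With real data, each grid cell would correspond to a satellite pixel (for example, a 10 m cell), and a held-out set of assayed samples would be needed to judge whether the predicted map is trustworthy.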

Keywords