Geosystems and Geoenvironment (May 2024)

A combination of multivariate statistics and machine learning techniques in groundwater characterization and quality forecasting

  • Mahamuda Abu,
  • Rabiu Musah,
  • Musah Saeed Zango

Journal volume & issue
Vol. 3, no. 2
p. 100261

Abstract

Globally, groundwater quality has been shown to be affected by natural and human activities in recent years. To ensure good drinking water (Sustainable Development Goal 6.3), the groundwater quality status of the area of interest needs to be elucidated. The groundwater in the northwestern parts of Ghana is not yet well characterized. Hence, this study combined hydrochemistry, the water quality index (WQI), multivariate statistics, and machine learning (ML) models, namely multiple linear regression (MLR), decision tree regression (DTR), random forest regression (RFR), and artificial neural network (ANN), to characterize and predict the water quality in the area. These methods are robust in providing conclusions on groundwater assessment that can be relied upon in decision-making on groundwater usage and monitoring. Except for NO3− and TDS, which exceed their standard levels in 22 and 2 locations, respectively, the physicochemical parameters are within acceptable limits. The groundwater is generally good for domestic usage based on the WQI, with 79.2% of the waters classed as excellent to good. The groundwater evolved from Na-type, Cl-type, and Cl(SO4)-Ca(Mg) facies. Agricultural activities are the main source of human impact on the groundwater. Silicate mineral dissolution and ion exchange are the natural processes that affect groundwater mineralization, with mineral dissolution being the dominant process. Based on the performance metrics (MAE, MSE, and RMSE) of the ML methods considered in the WQI forecasting, the order of model performance is ANN > RFR > DTR > MLR, with respective R² values of 0.9974, 0.9193, 0.8966, and 0.8886.
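
The four metrics used to rank the models can be illustrated with a minimal sketch. It assumes the standard definitions of MAE, MSE, RMSE, and R²; the function name regression_metrics and the WQI values are hypothetical and invented for illustration only. They are not data or code from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical WQI values, invented for illustration (not study data).
observed_wqi = [20.0, 40.0, 60.0, 80.0]
predicted_wqi = [22.0, 38.0, 63.0, 77.0]

metrics = regression_metrics(observed_wqi, predicted_wqi)
print(metrics)
```

Lower MAE, MSE, and RMSE and an R² closer to 1 indicate a closer fit, which is the sense in which a ranking such as ANN > RFR > DTR > MLR is read.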

Keywords