A Comparison of Ensemble and Deep Learning Algorithms to Model Groundwater Levels in a Data-Scarce Aquifer of Southern Africa

Zaheed Gaffoor; Kevin Pietersen; Nebo Jovanovic; Antoine Bagula; Thokozani Kanyerere; Olasupo Ajayi; Gift Wanangwa

doi:10.3390/hydrology9070125

Hydrology (Jul 2022)

Affiliations

Zaheed Gaffoor: IBM Research Africa, Johannesburg 2001, South Africa
Kevin Pietersen: Institute for Water Studies, University of the Western Cape, Cape Town 7535, South Africa
Nebo Jovanovic: Department of Earth Sciences, University of the Western Cape, Cape Town 7535, South Africa
Antoine Bagula: Department of Computer Science, University of the Western Cape, Cape Town 7535, South Africa
Thokozani Kanyerere: Department of Earth Sciences, University of the Western Cape, Cape Town 7535, South Africa
Olasupo Ajayi: Department of Computer Science, University of the Western Cape, Cape Town 7535, South Africa
Gift Wanangwa: Groundwater Division, Water Resources Department of the Ministry of Water and Sanitation, Tikwere House, Lilongwe 207203, Malawi

DOI: https://doi.org/10.3390/hydrology9070125
Journal volume & issue: Vol. 9, no. 7, p. 125

Abstract

Machine learning and deep learning have proven useful for modelling various groundwater phenomena. However, these techniques require large amounts of data to develop reliable models. In the Southern African Development Community, groundwater datasets are generally poorly developed. Hence, the question arises as to whether machine learning can be a reliable tool to support groundwater management in the data-scarce environments of Southern Africa. This study tests two machine learning algorithms, a gradient-boosted decision tree (GBDT) and a long short-term memory neural network (LSTM-NN), to model groundwater level (GWL) changes in the Shire Valley Alluvial Aquifer. Using data from two boreholes, Ngabu (sample size = 96) and Nsanje (sample size = 45), we model two predictive scenarios: (I) predicting the change in the current month’s groundwater level, and (II) predicting the change in the following month’s groundwater level. For the Ngabu borehole, GBDT achieved R² scores of 0.19 and 0.14, while LSTM achieved R² scores of 0.30 and 0.30, in scenarios I and II, respectively. For the Nsanje borehole, GBDT achieved R² scores of −0.04 and −0.21, while LSTM achieved R² scores of 0.03 and −0.15, in scenarios I and II, respectively. The results illustrate that LSTM performs better than the GBDT model, especially on the slightly longer time series and for extreme GWL changes. However, closer inspection reveals that where datasets are relatively small (e.g., Nsanje), the GBDT model may be more efficient, considering the cost required to tune, train, and test the LSTM model. Assessing the full spectrum of results, we conclude that these small sample sizes might not be sufficient to develop generalised and reliable machine learning models.
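
To make the two scenarios and the negative R² scores concrete, the following is a minimal Python sketch and not the authors' code. It assumes that scenario I pairs month t with the change observed in month t, that scenario II pairs month t with the change in month t + 1, and that R² is the usual coefficient of determination (1 minus the ratio of residual to total sum of squares). The function names and the level values are hypothetical and purely illustrative.

```python
def monthly_changes(levels):
    """Return month-over-month changes for a list of monthly groundwater levels."""
    return [curr - prev for prev, curr in zip(levels, levels[1:])]


def scenario_targets(levels):
    """Map each month index t to its prediction target under both scenarios.

    Assumption (not from the source): information available in month t is
    paired with the change in month t (scenario I) or month t + 1 (scenario II).
    """
    changes = monthly_changes(levels)
    # changes[i] is levels[i + 1] - levels[i], i.e. the change in month i + 1.
    scenario_1 = {t: changes[t - 1] for t in range(1, len(levels))}
    scenario_2 = {t: changes[t] for t in range(0, len(levels) - 1)}
    return scenario_1, scenario_2


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative monthly levels only; these are not data from the study.
    levels = [4.0, 4.5, 4.25, 5.0, 4.75]
    s1, s2 = scenario_targets(levels)
    observed = monthly_changes(levels)
    mean_change = sum(observed) / len(observed)

    print("Scenario I targets:", s1)
    print("Scenario II targets:", s2)
    print("R2 of a mean-only predictor:", r2_score(observed, [mean_change] * len(observed)))
    print("R2 of a sign-flipped predictor:", r2_score(observed, [-o for o in observed]))
```

Under this definition, a model that always predicts the mean observed change scores exactly zero, so the negative scores reported for Nsanje indicate predictions that fit the test data worse than that constant baseline.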

Published in Hydrology

ISSN: 2306-5338 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/hydrology
