Hydrology and Earth System Sciences (Oct 2024)

Assessing groundwater level modelling using a 1-D convolutional neural network (CNN): linking model performances to geospatial and time series features

  • M. Gomez,
  • M. Nölscher,
  • A. Hartmann,
  • S. Broda

DOI
https://doi.org/10.5194/hess-28-4407-2024
Journal volume & issue
Vol. 28
pp. 4407 – 4425

Abstract

Groundwater level (GWL) forecasting with machine learning has been widely studied due to its generally accurate results and low input data requirements. Furthermore, machine learning models for this purpose can be set up and trained quickly compared to the effort required for process-based numerical models. Although a given model architecture can perform well at specific locations, applying it to multiple sites across a region can yield varying accuracies. The reasons behind this discrepancy in model performance have scarcely been examined in previous studies. Here, we explore the relationship between model performance and the geospatial and time series features of the sites. Using precipitation (P) and temperature (T) as predictors, we model monthly groundwater levels at approximately 500 observation wells in Lower Saxony, Germany, applying a 1-D convolutional neural network (CNN) with a fixed architecture and hyperparameters tuned for each time series individually. The GWL records span 21 to 71 years, resulting in variable test and training dataset time ranges. The performances are evaluated against selected geospatial characteristics (e.g. land cover, distance to waterworks, and leaf area index) and time series features (e.g. autocorrelation, flat spots, and number of peaks) using Pearson correlation coefficients. Results indicate that model performance is negatively influenced at sites near waterworks and in densely vegetated areas. Longer subsequences of GWL measurements above or below the mean negatively impact the model accuracy. In addition, GWL time series containing more irregular patterns and a higher number of peaks might lead to higher model performances, possibly due to a closer link with precipitation dynamics.
As deep learning models are known to be black-box models missing the understanding of physical processes, our work provides new insights into how geospatial and time series features link to the input–output relationship of a GWL forecasting model.
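The evaluation step described in the abstract, relating per-well model performance to time series features through Pearson correlation coefficients, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the simple peak definition, and all numbers below are assumptions chosen only for illustration, and the study itself may compute these features differently.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def longest_run_above_mean(series):
    """Length of the longest consecutive run of values above the series mean."""
    mean = sum(series) / len(series)
    longest = current = 0
    for value in series:
        current = current + 1 if value > mean else 0
        longest = max(longest, current)
    return longest


def count_peaks(series):
    """Count strict local maxima (a simplified, assumed peak definition)."""
    return sum(
        1
        for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] > series[i + 1]
    )


if __name__ == "__main__":
    # Synthetic GWL series for three hypothetical wells (not real data).
    wells = {
        "well_a": [1.0, 1.4, 1.1, 1.6, 1.2, 1.7, 1.1],
        "well_b": [1.0, 1.1, 1.3, 1.4, 1.5, 1.2, 1.0],
        "well_c": [1.0, 1.2, 1.1, 1.3, 1.4, 1.5, 1.1],
    }
    # Made-up performance scores per well, in the same order as `wells`.
    scores = [0.85, 0.55, 0.70]

    peaks = [count_peaks(s) for s in wells.values()]
    runs = [longest_run_above_mean(s) for s in wells.values()]
    print("r(score, number of peaks):", pearson_r(scores, peaks))
    print("r(score, longest run above mean):", pearson_r(scores, runs))
```

In this sketch, each feature is reduced to one number per well, so the correlation is taken across wells rather than across time steps. The same pattern would apply to geospatial attributes such as distance to waterworks, given one value per site.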