Environmental Research Letters (Jan 2021)
Exploring the exceptional performance of a deep learning stream temperature model and the value of streamflow data
Abstract
Stream water temperature (T_s) is a variable of critical importance for aquatic ecosystem health. T_s is strongly affected by groundwater-surface water interactions which can be learned from streamflow records, but previously such information was challenging to effectively absorb with process-based models due to parameter equifinality. Based on the long short-term memory (LSTM) deep learning architecture, we developed a basin-centric lumped daily mean T_s model, which was trained over 118 data-rich basins with no major dams in the conterminous United States, and showed strong results. At a national scale, we obtained a median root-mean-square error of 0.69°C, Nash–Sutcliffe model efficiency coefficient of 0.985, and correlation of 0.994, which are marked improvements over previous values reported in literature. The addition of streamflow observations as a model input strongly elevated the performance of this model. In the absence of measured streamflow, we showed that a two-stage model could be used, where simulated streamflow from a pre-trained LSTM model (Q_sim) still benefited the T_s model even though no new information was brought directly into the inputs of the T_s model. The model indirectly used information learned from streamflow observations provided during the training of Q_sim, potentially to improve internal representation of physically meaningful variables. Our results indicate that strong relationships exist between basin-averaged forcing variables, catchment attributes, and T_s that can be simulated by a single model trained by data on the continental scale.
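For readers unfamiliar with the three scores quoted in the abstract, the following is a minimal sketch of how root-mean-square error, Nash–Sutcliffe efficiency, and Pearson correlation are conventionally computed for a single basin. It is not the authors' evaluation code: the function names and the four-day temperature series are hypothetical, chosen only for illustration, and the standard textbook definitions are assumed.

```python
import math


def rmse(obs, sim):
    """Root-mean-square error between observed and simulated series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sq_err = sum((s - o) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev


def pearson_r(obs, sim):
    """Pearson correlation coefficient between the two series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


# Hypothetical daily mean stream temperatures in degrees Celsius (not real data).
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [10.5, 11.5, 14.5, 15.5]

print(rmse(observed, simulated))       # error magnitude in degrees Celsius
print(nse(observed, simulated))        # 1.0 would be a perfect fit
print(pearson_r(observed, simulated))  # linear association, ignores bias
```

A national-scale summary like the one reported would then take the median of each per-basin score across all basins. Note that correlation is insensitive to a constant bias, which is one reason it is usually reported alongside RMSE and Nash–Sutcliffe efficiency rather than alone.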
Keywords