Hydrology and Earth System Sciences (Jul 2023)

Data worth analysis within a model-free data assimilation framework for soil moisture flow

  • Y. Wang,
  • X. Hu,
  • L. Wang,
  • J. Li,
  • L. Lin,
  • K. Huang,
  • L. Shi

DOI
https://doi.org/10.5194/hess-27-2661-2023
Journal volume & issue
Vol. 27
pp. 2661 – 2680

Abstract

Read online

Conventional data worth (DW) analysis for soil water problems depends on physical dynamic models. The widespread occurrence of model structural errors and the strong nonlinearity of soil water flow may lead to biased or wrong worth assessment. By introducing the nonparametric data worth analysis (NP-DWA) framework coupled with the ensemble Kalman filter (EnKF), this real-world case study attempts to assess the worth of potential soil moisture observations regarding the reconstruction of fully data-driven soil water flow models prior to data gathering. The DW of real-time soil moisture observations after Gaussian process training and Kalman update was quantified with three representative information metrics, including the trace, Shannon entropy difference and relative entropy. The sequential NP-DWA framework was examined by a number of cases in terms of the variable of interest, spatial location, observation error, and prior data content. Our results indicated that, similarly to the traditional DW analysis based on physical models, the overall increasing trend of the DW from the sequential augmentation of additional observations within the NP-DWA framework was also susceptible to interruptions by localized surges due to never-experienced atmospheric conditions (i.e., rainfall events). The difference is that this biased DW in the former is caused by model structural errors triggered by contrasting scenarios, which is difficult to be compensated for by assimilating more prior data, while this performance degradation in the NP-DWA can be effectively alleviated by enriching training scenarios or the appropriate amplification of observational noise under extreme meteorological conditions. Nevertheless, a substantial expansion of the prior data content may cause an unexpected increase in the DW of future potential observations due to the possible introduction of ensuing observation noises. Hence, high-quality and representative small data may be a better choice than unfiltered big data. Compared with the observations in the surface layer with the strongest time variability, the soil water content in the middle layer robustly exhibited remarkable superiority in the construction of model-free soil moisture models. We also demonstrated that the DW assessment performance was jointly determined by 3C, i.e., the capacity of potential observation realizations to capture actual observations, the correlation of potential observations with the variables of interest and the choice of DW indicators. Direct mapping from regular meteorological data to soil water content within the NP-DWA mitigated the adverse effects of nonlinearity-related interference, which thus facilitated the identification of the soil moisture covariance matrix, especially the cross-covariance.