Earth System Science Data (Nov 2022)
A 1 km daily soil moisture dataset over China using in situ measurement and machine learning
Abstract
High-quality gridded soil moisture products are essential for many Earth system science applications, but recent reanalysis and remote sensing soil moisture data are often available only at coarse resolution, and remote sensing data cover only the surface soil. Here, we present a 1 km resolution long-term soil moisture dataset derived through machine learning trained on in situ measurements from 1789 stations over China, named SMCI1.0 (Soil Moisture of China by in situ data, version 1.0). Random forest is used as a robust machine learning approach to predict soil moisture, with ERA5-Land time series, leaf area index, land cover type, topography and soil properties as predictors. SMCI1.0 provides soil moisture for 10 layers at 10 cm intervals down to a depth of 100 cm, at daily resolution over the period 2000–2020. Using in situ soil moisture as the benchmark, two independent experiments were conducted to evaluate the estimation accuracy of SMCI1.0: a year-to-year experiment (ubRMSE ranges from 0.041 to 0.052 and R ranges from 0.883 to 0.919) and a station-to-station experiment (ubRMSE ranges from 0.045 to 0.051 and R ranges from 0.866 to 0.893). SMCI1.0 generally has advantages over other gridded soil moisture products, including ERA5-Land, SMAP-L4, and SoMo.ml. However, high soil moisture errors are often located in the North China Monsoon Region. Overall, the high accuracy of the estimates in both the year-to-year and station-to-station experiments supports the applicability of SMCI1.0 to studying the spatial–temporal patterns of soil moisture. As SMCI1.0 is based on in situ data, it can be a useful complement to existing model-based and satellite-based soil moisture datasets for various hydrological, meteorological, and ecological analyses and models. The dataset is available at http://dx.doi.org/10.11888/Terre.tpdc.272415 (Shangguan et al., 2022).
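The abstract reports accuracy as ubRMSE and R without defining them. The sketch below shows how these two metrics are commonly computed, assuming the standard definitions: ubRMSE as the root-mean-square error after each series has its own mean removed, and R as the Pearson correlation coefficient. The function names and the sample values are hypothetical and are not taken from SMCI1.0 or its evaluation code.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def bias(pred, obs):
    # Mean difference between predictions and observations.
    return mean(pred) - mean(obs)


def rmse(pred, obs):
    # Root-mean-square error, including any systematic offset.
    return math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))


def ubrmse(pred, obs):
    # Unbiased RMSE (assumed standard definition): remove each series'
    # mean first, so a constant offset does not count as error.
    mp, mo = mean(pred), mean(obs)
    return math.sqrt(
        mean([((p - mp) - (o - mo)) ** 2 for p, o in zip(pred, obs)])
    )


def pearson_r(pred, obs):
    # Pearson correlation coefficient between the two series.
    mp, mo = mean(pred), mean(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


# Made-up daily soil moisture values for one station (illustration only).
observed = [0.20, 0.25, 0.31, 0.28, 0.22]
predicted = [0.23, 0.27, 0.35, 0.30, 0.26]

print("bias   =", round(bias(predicted, observed), 4))
print("RMSE   =", round(rmse(predicted, observed), 4))
print("ubRMSE =", round(ubrmse(predicted, observed), 4))
print("R      =", round(pearson_r(predicted, observed), 4))
```

Under these definitions, RMSE² = bias² + ubRMSE², so ubRMSE isolates the random component of the error from a constant wet or dry offset at a station.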