Improving global soil moisture prediction through cluster-averaged sampling strategy

Qingliang Li; Qiyun Xiao; Cheng Zhang; Jinlong Zhu; Xiao Chen; Yuguang Yan; Pingping Liu; Wei Shangguan; Zhongwang Wei; Lu Li; Wenzong Dong; Yongjiu Dai

Geoderma (Sep 2024)

Affiliations

Qingliang Li: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China; Research Institute for Scientific and Technological Innovation, Changchun Normal University, Changchun 130032, China; Corresponding author at: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China.
Qiyun Xiao: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China
Cheng Zhang: College of Computer Science and Technology, Jilin University, Changchun 130032, China
Jinlong Zhu: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China; Research Institute for Scientific and Technological Innovation, Changchun Normal University, Changchun 130032, China
Xiao Chen: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China; Research Institute for Scientific and Technological Innovation, Changchun Normal University, Changchun 130032, China
Yuguang Yan: College of Computer Science and Technology, Changchun Normal University, Changchun 130032, China; Research Institute for Scientific and Technological Innovation, Changchun Normal University, Changchun 130032, China
Pingping Liu: College of Computer Science and Technology, Jilin University, Changchun 130032, China
Wei Shangguan: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), and Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
Zhongwang Wei: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), and Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
Lu Li: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), and Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
Wenzong Dong: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), and Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
Yongjiu Dai: Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), and Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China

Journal volume & issue: Vol. 449, p. 116999

Abstract

Understanding and predicting global soil moisture (SM) is crucial for water resource management and agricultural production. While deep learning (DL) methods have shown strong performance in SM prediction, imbalances among training samples with different characteristics pose a significant challenge. We propose that improving the diversity and balance of the training samples in each batch during gradient descent can help address this issue. To test this hypothesis, we developed a Cluster-Averaged Sampling (CAS) strategy based on unsupervised learning techniques. The strategy trains the model on data sampled evenly from different clusters, ensuring both sample diversity and numerical consistency within each cluster. This prevents the model from overemphasizing specific sample characteristics and leads to more balanced feature learning. Experiments on the LandBench1.0 dataset with five different seeds for 1-day lead-time global predictions show that CAS outperforms several Long Short-Term Memory (LSTM)-based models that do not employ this strategy. The median coefficient of determination (R²) improved by 2.36% to 4.31%, while the Kling-Gupta Efficiency (KGE) improved by 1.95% to 3.16%. In specific high-latitude regions, R² improvements exceeded 40%. To further validate CAS under realistic conditions, we tested it on Soil Moisture Active and Passive Level 3 (SMAP-L3) satellite data for 1- to 3-day lead-time global predictions, which confirmed its efficacy. The study substantiates the CAS strategy and introduces a novel training method for enhancing the generalization of DL models.
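
The idea of drawing each training batch evenly from different clusters can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cluster_balanced_batches`, its parameters, and the toy labels are hypothetical, and the cluster labels are assumed to come from a separate unsupervised step (for example, k-means on sample features) that is not shown here. Sampling with replacement for clusters smaller than their share of the batch is likewise an assumption of this sketch.

```python
import random
from collections import defaultdict


def cluster_balanced_batches(labels, batch_size, n_batches, seed=0):
    """Yield batches of sample indices with an equal count from each cluster.

    labels[i] is the cluster id of sample i. The labels are assumed to be
    produced beforehand by some unsupervised method; this is a hypothetical
    helper for illustration, not the published CAS code.
    """
    rng = random.Random(seed)

    # Group sample indices by their cluster id.
    members = defaultdict(list)
    for idx, lab in enumerate(labels):
        members[lab].append(idx)

    clusters = sorted(members)
    per_cluster = batch_size // len(clusters)
    if per_cluster == 0:
        raise ValueError("batch_size must be at least the number of clusters")

    for _ in range(n_batches):
        batch = []
        for c in clusters:
            pool = members[c]
            if len(pool) >= per_cluster:
                # Enough members: draw without replacement.
                batch.extend(rng.sample(pool, per_cluster))
            else:
                # Small cluster: draw with replacement (assumption of this sketch).
                batch.extend(rng.choices(pool, k=per_cluster))
        rng.shuffle(batch)  # avoid ordering the batch by cluster
        yield batch


# Imbalanced toy labels: 90 samples in cluster 0, 8 in cluster 1, 2 in cluster 2.
labels = [0] * 90 + [1] * 8 + [2] * 2
batches = list(cluster_balanced_batches(labels, batch_size=12, n_batches=3))
```

With uniform random sampling, roughly 90% of each batch in this toy case would come from cluster 0. Here every batch holds four indices from each cluster, so the rare clusters contribute to each gradient step as much as the dominant one.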

Published in Geoderma

ISSN: 1872-6259 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Science
Website: https://www.sciencedirect.com/journal/geoderma
