IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
Downscaling Administrative-Level Crop Yield Statistics to 1 km Grids Using Multisource Remote Sensing Data and Ensemble Machine Learning
Abstract
The United States (U.S.) is a global leader in the production and export of soybeans and corn. Accurate monitoring and estimation of U.S. soybean and corn yields is essential for improving global food security. However, publicly available datasets describing the spatial distribution of U.S. corn and soybean yields at high temporal and spatial resolution are currently lacking, which hampers related research and policy-making. In this study, we propose a statistical downscaling framework that produces spatially explicit crop yield estimates from multisource environmental covariates and ensemble machine learning methods. Using the optimal Cubist model, we produced 1-km resolution maps of U.S. soybean and corn yields for 2006 to 2021, resulting in the USASoy&CornYield1km dataset. Accuracy was stable over the period 2006–2021, with R² values ranging from 0.70 to 0.89 (average 0.80) for corn and from 0.74 to 0.90 (average 0.81) for soybeans. Comparison with the spatial production allocation model (SPAM) dataset further supported the reliability of the dataset, with correlations against SPAM2010 of 0.84 for soybeans and 0.78 for corn. Spatial uncertainty analysis showed a yield estimation uncertainty of 14.04% for soybeans and 20.49% for corn, indicating a generally low level of uncertainty. Overall, the USASoy&CornYield1km dataset offers higher spatial and temporal resolution, captures yield variations within counties, and covers a long time span. This study provides a basis for analyzing U.S. soybean and corn yields and for improving agricultural production.
Keywords