Artificial Intelligence in Agriculture (Jan 2021)
Combining machine learning, space-time cloud restoration and phenology for farm-level wheat yield prediction
Abstract
Although studies have shown the potential of high-resolution optical sensors for crop yield prediction, several factors have limited their wider application. The main factors are cloud obstruction, the identification of phenology, the demand for high-performance computing infrastructure and the complexity of statistical methods. In this research, we developed a novel approach that combines four methods. First, we implemented the cloud restoration algorithm called gapfill to restore Normalized Difference Vegetation Index (NDVI) values, derived from the Sentinel-2 (S2) sensor, that were missing due to cloud obstruction. Second, we created square tiles to reduce the computing infrastructure demand that comes with a high-resolution sensor. Third, we implemented gapfill following the critical crop phenology stages. Fourth, we combined observations from restored images with original values (from cloud-free images) and used them for winter wheat yield prediction. We applied seven base machine learning models as well as two groups of super learning ensembles. The study successfully applied gapfill to high-resolution images to obtain good-quality estimates for cloudy pixels. Consequently, yield prediction accuracy increased when restored values were incorporated into the regression process. Base models such as Generalized Linear Regression (GLM) and Random Forest (RF) performed better than the other base and ensemble models. The two models achieved RMSEs of 0.001 t/ha and 0.136 t/ha on the holdout group. They also showed consistent and better performance in scatter plot analysis across three datasets. The approach developed here is useful for predicting wheat yield at field scale using optical sensors, information that is rarely available but vital in many development projects.
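Three of the quantities named in the abstract can be illustrated with a minimal Python sketch: the NDVI ratio, the splitting of a raster into square tiles, and the RMSE used to score predictions. This is not the authors' implementation. The function names, the tile size and the use of NumPy are assumptions made for illustration, and the gapfill restoration step itself is not reproduced. For Sentinel-2, NDVI is commonly computed from the near-infrared band (B8) and the red band (B4).

```python
import numpy as np


def ndvi(nir, red):
    """NDVI = (NIR - red) / (NIR + red); NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


def square_tiles(image, tile_size):
    """Yield (row, col, tile) for square tiles; edge tiles may be smaller."""
    rows, cols = image.shape[:2]
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            yield r, c, image[r:r + tile_size, c:c + tile_size]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted yields."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Hypothetical usage: a small synthetic scene processed tile by tile.
nir_band = np.random.default_rng(0).uniform(0.2, 0.6, size=(5, 5))
red_band = np.random.default_rng(1).uniform(0.05, 0.2, size=(5, 5))
scene_ndvi = ndvi(nir_band, red_band)
tile_means = [tile.mean() for _, _, tile in square_tiles(scene_ndvi, 2)]
```

Processing one tile at a time keeps the memory footprint bounded by the tile size rather than by the full scene, which is the motivation the abstract gives for tiling.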