Sensors (Feb 2023)

Can Satellites Predict Yield? Ensemble Machine Learning and Statistical Analysis of Sentinel-2 Imagery for Processing Tomato Yield Prediction

  • Nicoleta Darra,
  • Borja Espejo-Garcia,
  • Aikaterini Kasimati,
  • Olga Kriezi,
  • Emmanouil Psomiadis,
  • Spyros Fountas

DOI
https://doi.org/10.3390/s23052586
Journal volume & issue
Vol. 23, no. 5
p. 2586

Abstract

In this paper, we propose an approach for robust prediction of processing tomato yield using open-source AutoML techniques and statistical analysis. Sentinel-2 satellite imagery was used to obtain values of five selected vegetation indices (VIs) at 5-day intervals during the 2021 growing season (April to September). Actual recorded yields were collected across 108 fields, corresponding to a total area of 410.10 ha of processing tomato in central Greece, to assess the performance of the VIs at different temporal scales. In addition, the VIs were linked to crop phenology to establish the annual dynamics of the crop. The highest Pearson coefficient (r) values occurred in the period from 80 to 90 days, indicating a strong relationship between the VIs and yield. Specifically, RVI presented the highest correlation values of the growing season at 80 days (r = 0.72) and 90 days (r = 0.75), while NDVI performed better at 85 days (r = 0.72). This result was confirmed by the AutoML technique, which also indicated the highest performance of the VIs during the same period, with adjusted R² values ranging from 0.60 to 0.72. The most precise results were obtained with an ensemble combining ARD regression and SVR (adj. R² = 0.67 ± 0.02).
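
The quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not give the index formulas, so the code assumes the common definitions NDVI = (NIR - Red) / (NIR + Red) and RVI = NIR / Red (ratio vegetation index), together with the standard Pearson r and adjusted R² formulas. The function names are hypothetical and the reflectance values in the usage note are made up for illustration; none of this is the authors' code or data.

```python
from math import sqrt


def ndvi(nir, red):
    """Normalized difference vegetation index (assumed standard definition)."""
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio vegetation index, assumed here to be NIR / Red."""
    return nir / red


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R²: penalizes R² for the number of predictors used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

For example, with hypothetical per-field reflectances, `pearson_r([rvi(n, r) for n, r in bands], yields)` gives the correlation between RVI on one acquisition date and recorded yield; repeating this for each 5-day acquisition traces how the correlation evolves over the season.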

Keywords