Agronomy (Jul 2020)

Crop Yield Prediction through Proximal Sensing and Machine Learning Algorithms

  • Farhat Abbas,
  • Hassan Afzaal,
  • Aitazaz A. Farooque,
  • Skylar Tang

DOI
https://doi.org/10.3390/agronomy10071046
Journal volume & issue
Vol. 10, no. 7
p. 1046

Abstract

Proximal sensing techniques can potentially survey soil and crop variables responsible for variations in crop yield. The full potential of these precision agriculture technologies may be exploited in combination with innovative data processing methods, such as machine learning (ML) algorithms, to extract useful information on the factors controlling crop yield. Four ML algorithms, namely linear regression (LR), elastic net (EN), k-nearest neighbor (k-NN), and support vector regression (SVR), were used to predict potato (Solanum tuberosum) tuber yield from soil and crop property data collected through proximal sensing. Six fields in Atlantic Canada, including three fields in Prince Edward Island (PE) and three fields in New Brunswick (NB), were sampled over two growing seasons (2017 and 2018) for soil electrical conductivity, soil moisture content, soil slope, normalized difference vegetation index (NDVI), and soil chemistry. Data were collected from 39–40 locations (30 × 30 m each) in each field, four times throughout the growing season, and yield samples were collected manually at the end of the growing season. Four datasets, namely PE-2017, PE-2018, NB-2017, and NB-2018, were then formed by combining data points from the three fields in each province to represent the province data for the respective years. Modeling techniques were employed to generate yield predictions, which were assessed with different statistical parameters. The SVR models outperformed all other models for the NB-2017, NB-2018, PE-2017, and PE-2018 datasets, with root mean square errors (RMSE) of 5.97, 4.62, 6.60, and 6.17 t/ha, respectively. The performance of k-NN remained poor in three out of four datasets, namely NB-2017, NB-2018, and PE-2017, with RMSE of 6.93, 5.23, and 6.91 t/ha, respectively. The study also showed that large datasets are required to generate useful results using any of these models. This information is needed for creating site-specific management zones for potatoes, which form a significant component of food security initiatives across the globe.
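The abstract ranks models by RMSE, so a short sketch of that comparison may help. It is only an illustration under stated assumptions, not the authors' code: the yield values and the names `model_a` and `model_b` are hypothetical, and the standard RMSE definition is assumed.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical tuber yields in t/ha, for illustration only (not study data).
observed = [30.0, 35.0, 40.0]

# Hypothetical predictions from two placeholder models.
model_predictions = {
    "model_a": [33.0, 31.0, 40.0],
    "model_b": [36.0, 29.0, 46.0],
}

# Score each model; the lowest RMSE indicates the closest fit.
scores = {name: rmse(observed, preds) for name, preds in model_predictions.items()}
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f} t/ha")
print(f"Lowest RMSE: {best_model}")
```

Because RMSE is expressed in the same unit as the yield (t/ha), values from different models on the same dataset can be compared directly, as the abstract does for SVR and k-NN.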

Keywords