Scientific Reports (May 2021)

Supervised binary classification methods for strawberry ripeness discrimination from bioimpedance data

  • Pietro Ibba,
  • Christian Tronstad,
  • Roberto Moscetti,
  • Tanja Mimmo,
  • Giuseppe Cantarella,
  • Luisa Petti,
  • Ørjan G. Martinsen,
  • Stefano Cesco,
  • Paolo Lugli

DOI
https://doi.org/10.1038/s41598-021-90471-5
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 13

Abstract

Strawberry is one of the most popular fruits on the market. To meet demanding consumer and market quality standards, there is a strong need for an on-site, accurate and reliable grading system throughout the harvesting process. In this work, a total of 923 strawberry fruits were measured directly on-plant at different ripening stages by bioimpedance, with data collected at frequencies between 20 Hz and 300 kHz. The fruit batch was then split into two classes (i.e. ripe and unripe) based on surface color data. Starting from these data, six of the most commonly used supervised machine learning classification techniques, i.e. Logistic Regression (LR), Binary Decision Trees (DT), Naive Bayes Classifiers (NBC), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Multi-Layer Perceptron Networks (MLP), were employed, optimized, tested and compared in terms of their performance in predicting the strawberry fruit ripening stage. The models were trained to develop a complete feature selection and optimization pipeline, not yet available for bioimpedance data analysis of fruit. The classification results showed that, among all the tested methods, MLP networks performed best on the test set, with 0.72, 0.82 and 0.73 for the $F_1$, $F_{0.5}$ and $F_2$ scores, respectively, and improved on the training results, showing good generalization to new, previously unseen data. Consequently, MLP models trained with bioimpedance data are a promising alternative for real-time estimation of strawberry ripeness directly in the field, which could help farmers and producers manage harvesting time.
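
The three scores quoted in the abstract are instances of the general F-beta measure, which combines precision P and recall R as F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). A beta below 1 weights precision more heavily, and a beta above 1 weights recall more heavily. The following is a minimal sketch of that definition in Python. The function name `f_beta` and the confusion-matrix counts in the usage lines are illustrative assumptions, not values or code from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from binary confusion-matrix counts.

    tp: true positives (e.g. ripe fruit predicted as ripe)
    fp: false positives (unripe fruit predicted as ripe)
    fn: false negatives (ripe fruit predicted as unripe)
    """
    b2 = beta * beta
    # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in counts
    # so that no separate division for precision and recall is needed.
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts, chosen only to show how beta shifts the score.
counts = dict(tp=80, fp=10, fn=40)
for beta in (1.0, 0.5, 2.0):
    print(f"F_{beta}: {f_beta(beta=beta, **counts):.3f}")
```

With these made-up counts precision is higher than recall, so the score at beta = 0.5 comes out above the score at beta = 1, and the score at beta = 2 comes out below it.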