IEEE Access (Jan 2019)

Classification of Cape Gooseberry Fruit According to its Level of Ripeness Using Machine Learning Techniques and Different Color Spaces

  • Wilson Castro,
  • Jimy Oblitas,
  • Miguel De-La-Torre,
  • Carlos Cotrina,
  • Karen Bazan,
  • Himer Avila-George

DOI
https://doi.org/10.1109/ACCESS.2019.2898223
Journal volume & issue
Vol. 7
pp. 27389 – 27400

Abstract

The classification of fresh fruits according to their visual ripeness is typically a subjective and tedious task; consequently, there is growing interest in non-contact techniques to automate this process. Machine learning techniques, such as artificial neural networks, support vector machines (SVMs), decision trees, and k-nearest neighbor algorithms, have been successfully applied to classification problems in the literature, particularly for images of fruit. However, the particularities of each classification problem make it difficult, if not impossible, to select a general technique that is applicable to all types of fruit. In this paper, combinations of four machine learning techniques and three color spaces (RGB, HSV, and L*a*b*) were evaluated with regard to their ability to classify Cape gooseberry fruits. To this end, 925 Cape gooseberry fruit samples were collected, and each fruit was manually classified into one of seven classes according to its level of ripeness. The color values of each fruit image in the three color spaces, together with the corresponding ripening stages, were organized for training and validation following a fivefold cross-validation strategy repeated 100 times. According to the results, the classification of Cape gooseberry fruits by ripeness level was sensitive to both the color space and the classification technique used. Among the color spaces, models based on L*a*b* showed the highest F-measure, and the SVM classifier showed the highest F-measure regardless of the color space. Combining the color spaces through principal component analysis improved the performance of the models at the expense of increased complexity.
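
The evaluation protocol described in the abstract (color features in more than one color space, a classifier, repeated fivefold cross-validation, and the F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic mean-color data generated by the hypothetical helper `make_synthetic_samples`, uses a plain k-nearest neighbor classifier as a stand-in for one of the four techniques, compares only RGB and HSV (the Python standard library has no L*a*b* conversion), and runs 3 repetitions instead of 100 to keep it quick.

```python
import colorsys
import random
from collections import Counter


def make_synthetic_samples(n_per_class=30, n_classes=7, seed=0):
    """Hypothetical data: one mean RGB triple per fruit, drifting from
    green to orange as the ripeness class increases. Not real measurements."""
    rng = random.Random(seed)
    samples = []
    for label in range(n_classes):
        t = label / (n_classes - 1)
        base = (0.35 + 0.55 * t, 0.65 - 0.15 * t, 0.20 - 0.10 * t)
        for _ in range(n_per_class):
            rgb = tuple(min(1.0, max(0.0, c + rng.gauss(0, 0.03))) for c in base)
            samples.append((rgb, label))
    return samples


def to_hsv(rgb):
    """Convert an RGB triple in [0, 1] to HSV with the standard library."""
    return colorsys.rgb_to_hsv(*rgb)


def knn_predict(train, x, k=5):
    """Majority vote among the k training samples closest to x.
    Plain Euclidean distance; hue wraparound is ignored in this sketch."""
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def macro_f_measure(y_true, y_pred, n_classes):
    """Unweighted mean over classes of F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def repeated_kfold_score(samples, transform, n_classes, folds=5, repeats=3, seed=0):
    """Mean macro F-measure over `repeats` reshuffled k-fold partitions."""
    rng = random.Random(seed)
    data = [(transform(rgb), label) for rgb, label in samples]
    scores = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for f in range(folds):
            # Every folds-th sample (offset f) is held out for validation.
            test = shuffled[f::folds]
            train = [s for i, s in enumerate(shuffled) if i % folds != f]
            y_true = [label for _, label in test]
            y_pred = [knn_predict(train, x) for x, _ in test]
            scores.append(macro_f_measure(y_true, y_pred, n_classes))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    samples = make_synthetic_samples()
    for name, transform in (("RGB", lambda rgb: rgb), ("HSV", to_hsv)):
        score = repeated_kfold_score(samples, transform, n_classes=7)
        print(name, round(score, 3))
```

Because the same samples and the same folds are scored under each color space, any difference between the printed values reflects only the feature representation, which is the kind of sensitivity the abstract reports. The numbers themselves come from synthetic data and say nothing about the paper's results.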

Keywords