Frontiers in Nutrition (Mar 2024)

NutriGreen image dataset: a collection of annotated nutrition, organic, and vegan food products

  • Jan Drole,
  • Igor Pravst,
  • Tome Eftimov,
  • Barbara Koroušić Seljak

DOI
https://doi.org/10.3389/fnut.2024.1342823
Journal volume & issue
Vol. 11

Abstract


Introduction: In this research, we introduce the NutriGreen dataset, a collection of images of branded food products intended for training segmentation models that detect various labels on food packaging. Images in the dataset are annotated for three distinct labels: the Nutri-Score, which indicates nutritional quality; the V-Label, which denotes vegan or vegetarian origin; and the EU organic certification (BIO) logo.

Methods: To create the dataset, we used a semi-automatic annotation pipeline that combines annotation by domain experts with automatic annotation by a deep learning model.

Results: The dataset comprises a total of 10,472 images. Among these, the Nutri-Score label is distributed across five sub-labels: grade A with 1,250 images, grade B with 1,107 images, grade C with 867 images, grade D with 1,001 images, and grade E with 967 images. Additionally, there are 870 images featuring the V-Label, 2,328 images showing the BIO label, and 3,201 images without any of the aforementioned labels. Furthermore, we fine-tuned the YOLOv5 segmentation model to demonstrate the practicality of using this annotated dataset, achieving an accuracy of 94.0%.

Discussion: These promising results indicate that the dataset has significant potential for training innovative systems capable of detecting food labels. Moreover, it can serve as a valuable benchmark dataset for emerging computer vision systems.
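
The counts reported in the abstract can be cross-checked with a short tally. The following is a minimal Python sketch and not part of the published work: the dictionary keys and the helper name `summarize` are illustrative assumptions, while the numbers are copied from the abstract. The per-label counts add up to more than the 10,472 images in the dataset, which would be consistent with some images carrying more than one label, although the abstract does not state this explicitly.

```python
# Per-label image counts as reported in the abstract.
# The key names are illustrative; they are not the dataset's class identifiers.
label_counts = {
    "Nutri-Score A": 1250,
    "Nutri-Score B": 1107,
    "Nutri-Score C": 867,
    "Nutri-Score D": 1001,
    "Nutri-Score E": 967,
    "V-Label": 870,
    "BIO": 2328,
    "no label": 3201,
}

# Total number of images as reported in the abstract.
TOTAL_IMAGES = 10472


def summarize(counts, total_images):
    """Return simple totals derived from the per-label counts (hypothetical helper)."""
    nutri_score = sum(
        n for name, n in counts.items() if name.startswith("Nutri-Score")
    )
    sum_of_counts = sum(counts.values())
    return {
        "nutri_score_images": nutri_score,
        "sum_of_counts": sum_of_counts,
        # A positive value means the per-label counts exceed the image total.
        "excess_over_total": sum_of_counts - total_images,
    }


summary = summarize(label_counts, TOTAL_IMAGES)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under the assumption that each count refers to images rather than to individual label instances, the difference between the summed counts and the image total gives a lower bound on how many label assignments fall on images that already carry another label.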

Keywords