NutriGreen image dataset: a collection of annotated nutrition, organic, and vegan food products

Jan Drole; Igor Pravst; Tome Eftimov; Barbara Koroušić Seljak

doi:10.3389/fnut.2024.1342823

Frontiers in Nutrition (Mar 2024)

Affiliations

Jan Drole: Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
Jan Drole: Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Igor Pravst: Nutrition Institute, Ljubljana, Slovenia
Igor Pravst: Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
Igor Pravst: VIST–Faculty of Applied Sciences, Ljubljana, Slovenia
Tome Eftimov: Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Barbara Koroušić Seljak: Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Barbara Koroušić Seljak: Jožef Stefan International Postgraduate School, Ljubljana, Slovenia

DOI: https://doi.org/10.3389/fnut.2024.1342823
Journal volume & issue: Vol. 11

Abstract

Introduction: In this research, we introduce the NutriGreen dataset, a collection of images of branded food products intended for training segmentation models that detect various labels on food packaging. Images in the dataset are annotated for three distinct labels: one indicating nutritional quality (the Nutri-Score), another denoting whether the product is of vegan or vegetarian origin (the V-Label), and a third displaying the EU organic certification (BIO) logo.

Methods: To create the dataset, we used a semi-automatic annotation pipeline that combines domain-expert annotation with automatic annotation by a deep learning model.

Results: The dataset comprises a total of 10,472 images. Among these, the Nutri-Score label is distributed across five sub-labels: Nutri-Score grade A with 1,250 images, grade B with 1,107 images, grade C with 867 images, grade D with 1,001 images, and grade E with 967 images. Additionally, there are 870 images featuring the V-Label, 2,328 images showcasing the BIO label, and 3,201 images without any of the aforementioned labels. Furthermore, we fine-tuned the YOLOv5 segmentation model to demonstrate the practicality of using these annotated datasets, achieving an accuracy of 94.0%.

Discussion: These results indicate that the dataset has significant potential for training systems capable of detecting food labels. Moreover, it can serve as a benchmark dataset for emerging computer vision systems.
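The counts reported in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the dictionary keys are hypothetical names chosen here, not identifiers from the dataset. It also assumes that each figure counts the images on which that label appears. Under that assumption, the per-label counts add up to more than the total number of images, which would suggest that some images carry more than one label.

```python
# Per-label image counts as reported in the abstract.
# The key names are hypothetical; they are not taken from the dataset itself.
label_counts = {
    "nutriscore_A": 1250,
    "nutriscore_B": 1107,
    "nutriscore_C": 867,
    "nutriscore_D": 1001,
    "nutriscore_E": 967,
    "v_label": 870,
    "bio": 2328,
    "no_label": 3201,
}
total_images = 10472  # total dataset size reported in the abstract

# Images carrying any Nutri-Score grade (sum over the five sub-labels).
nutri_score_total = sum(
    count for name, count in label_counts.items() if name.startswith("nutriscore_")
)

# Sum of all per-label counts, including images without any label.
label_sum = sum(label_counts.values())

# Assumption: each count is "images showing this label", so a positive
# surplus is consistent with some images showing more than one label.
surplus = label_sum - total_images

print(f"Nutri-Score images: {nutri_score_total}")
print(f"Sum of per-label counts: {label_sum}")
print(f"Surplus over total images: {surplus}")
```

If the counts are instead tallied per annotation rather than per image, the surplus would need a different interpretation; the abstract does not specify which.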

Published in Frontiers in Nutrition

ISSN: 2296-861X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Technology: Home economics: Nutrition. Foods and food supply
Website: https://www.frontiersin.org/journals/nutrition/
