Data in Brief (Jun 2021)

Food packaging permeability and composition dataset dedicated to text-mining

  • Martin Lentschat,
  • Patrice Buche,
  • Juliette Dibie-Barthelemy,
  • Luc Menut,
  • Mathieu Roche

Journal volume & issue
Vol. 36
p. 107135

Abstract

This dataset is composed of symbolic and quantitative entities concerning food packaging composition and gas permeability. It was created from 50 English-language scientific articles, saved in HTML format, from several international journals on the ScienceDirect website. The files were annotated independently by three experts on a WebAnno server. The aim of the annotation task was to recognize all entities related to packaging permeability measures and packaging composition. The task was driven by an Ontological and Terminological Resource (OTR), and an annotation guideline was designed through a collective, iterative process involving the annotators. This dataset can be used to train or evaluate natural language processing (NLP) approaches in experimental fields, such as specialized entity recognition (e.g. terms and their variations, units of measure, complex numerical values) or sentence-level binary relation extraction (e.g. value to unit, term to acronym).
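To make the "value to unit" relation concrete, the following is a minimal sketch of a rule-based baseline that pairs a numerical value with the unit that follows it in a sentence. It is not the authors' method and does not use the dataset's annotation format; the unit list, the function name `value_unit_pairs`, and the example sentence are assumptions made purely for illustration.

```python
import re

# Hypothetical unit vocabulary for illustration only; the dataset's OTR
# defines the actual units of measure.
UNITS = ["cm3 m-2 day-1", "g m-2 day-1", "°C", "%", "µm"]

# A plain integer or decimal number (complex numerical values are out of scope here).
NUMBER = r"\d+(?:\.\d+)?"

# Try longer units first so a multi-token unit is not cut short.
UNIT = "|".join(re.escape(u) for u in sorted(UNITS, key=len, reverse=True))

PATTERN = re.compile(rf"(?P<value>{NUMBER})\s*(?P<unit>{UNIT})")


def value_unit_pairs(sentence):
    """Return (value, unit) pairs where a known unit directly follows a number."""
    return [(m.group("value"), m.group("unit")) for m in PATTERN.finditer(sentence)]


# Invented example sentence, not taken from the dataset.
example = "The film had an oxygen permeability of 12.5 cm3 m-2 day-1 at 23 °C and 50 % RH."
pairs = value_unit_pairs(example)
```

A baseline like this only covers units written exactly as listed and values adjacent to their unit; annotated data such as this dataset is what allows such rules, or learned models, to be evaluated against expert judgments.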

Keywords