Frontiers in Nutrition (Jan 2022)

Multi-Dimensional Dataset of Open Data and Satellite Images for Characterization of Food Security and Nutrition

  • David S. Restrepo,
  • Luis E. Pérez,
  • Diego M. López,
  • Rubiel Vargas-Cañas,
  • Juan Sebastian Osorio-Valencia

DOI
https://doi.org/10.3389/fnut.2021.796082
Journal volume & issue
Vol. 8

Abstract

Read online

BackgroundNutrition is one of the main factors affecting the development and quality of life of a person. From a public health perspective, food security is an essential social determinant for promoting healthy nutrition. Food security embraces four dimensions: physical availability of food, economic and physical access to food, food utilization, and the sustainability of the dimensions above. Integrally addressing the four dimensions is vital. Surprisingly most of the works focused on a single dimension of food security: the physical availability of food.ObjectiveThe paper proposes a multi-dimensional dataset of open data and satellite images to characterize food security in the department of Cauca, Colombia.MethodsThe food security dataset integrates multiple open data sources; therefore, the Cross-Industry Standard Process for Data Mining methodology was used to guide the construction of the dataset. It includes sources such as population and agricultural census, nutrition surveys, and satellite images.ResultsAn open multidimensional dataset for the Department of Cauca with 926 attributes and 9 rows (each row representing a Municipality) from multiple sources in Colombia, is configured. Then, machine learning models were used to characterize food security and nutrition in the Cauca Department. As a result, The Food security index calculated for Cauca using a linear regression model (Mean Absolute Error of 0.391) is 57.444 in a range between 0 and 100, with 100 the best score. Also, an approach for extracting four features (Agriculture, Habitation, Road, Water) of satellite images were tested with the ResNet50 model trained from scratch, having the best performance with a macro-accuracy, macro-precision, macro-recall, and macro-F1-score of 91.7, 86.2, 66.91, and 74.92%, respectively.ConclusionIt shows how the CRISP-DM methodology can be used to create an open public health data repository. Furthermore, this methodology could be generalized to other types of problems requiring the creation of a dataset. In addition, the use of satellite images presents an alternative for places where data collection is challenging. The model and methodology proposed based on open data become a low-cost and effective solution that could be used by decision-makers, especially in developing countries, to support food security planning.

Keywords