JMIR mHealth and uHealth (Mar 2020)

Volumetric Food Quantification Using Computer Vision on a Depth-Sensing Smartphone: Preclinical Study

  • Herzig, David,
  • Nakas, Christos T,
  • Stalder, Janine,
  • Kosinski, Christophe,
  • Laesser, Céline,
  • Dehais, Joachim,
  • Jaeggi, Raphael,
  • Leichtle, Alexander Benedikt,
  • Dahlweid, Fried-Michael,
  • Stettler, Christoph,
  • Bally, Lia

DOI
https://doi.org/10.2196/15294
Journal volume & issue
Vol. 8, no. 3
p. e15294

Abstract

Read online

BackgroundQuantification of dietary intake is key to the prevention and management of numerous metabolic disorders. Conventional approaches are challenging, laborious, and lack accuracy. The recent advent of depth-sensing smartphones in conjunction with computer vision could facilitate reliable quantification of food intake. ObjectiveThe objective of this study was to evaluate the accuracy of a novel smartphone app combining depth-sensing hardware with computer vision to quantify meal macronutrient content using volumetry. MethodsThe app ran on a smartphone with a built-in depth sensor applying structured light (iPhone X). The app estimated weight, macronutrient (carbohydrate, protein, fat), and energy content of 48 randomly chosen meals (breakfasts, cooked meals, snacks) encompassing 128 food items. The reference weight was generated by weighing individual food items using a precision scale. The study endpoints were (1) error of estimated meal weight, (2) error of estimated meal macronutrient content and energy content, (3) segmentation performance, and (4) processing time. ResultsIn both absolute and relative terms, the mean (SD) absolute errors of the app’s estimates were 35.1 g (42.8 g; relative absolute error: 14.0% [12.2%]) for weight; 5.5 g (5.1 g; relative absolute error: 14.8% [10.9%]) for carbohydrate content; 1.3 g (1.7 g; relative absolute error: 12.3% [12.8%]) for fat content; 2.4 g (5.6 g; relative absolute error: 13.0% [13.8%]) for protein content; and 41.2 kcal (42.5 kcal; relative absolute error: 12.7% [10.8%]) for energy content. Although estimation accuracy was not affected by the viewing angle, the type of meal mattered, with slightly worse performance for cooked meals than for breakfasts and snacks. Segmentation adjustment was required for 7 of the 128 items. Mean (SD) processing time across all meals was 22.9 seconds (8.6 seconds). ConclusionsThis study evaluated the accuracy of a novel smartphone app with an integrated depth-sensing camera and found highly accurate volume estimation across a broad range of food items. In addition, the system demonstrated high segmentation performance and low processing time, highlighting its usability.