IEEE Access (Jan 2020)

Machine Learning Approach for Postprandial Blood Glucose Prediction in Gestational Diabetes Mellitus

  • Evgenii A. Pustozerov,
  • Aleksandra S. Tkachuk,
  • Elena A. Vasukova,
  • Anna D. Anopova,
  • Maria A. Kokina,
  • Inga V. Gorelova,
  • Tatiana M. Pervunina,
  • Elena N. Grineva,
  • Polina V. Popova

DOI
https://doi.org/10.1109/ACCESS.2020.3042483
Journal volume & issue
Vol. 8
pp. 219308 – 219321

Abstract

Postprandial blood glucose prediction is a crucial part of diabetes management. Recently, this topic has been of great interest, resulting in many research projects and published papers. Although different input parameters that might be beneficial for blood glucose prediction models were comprehensively discussed, specific data preprocessing, feature engineering and model tuning steps were not explained in detail in many of these papers. In this work, we developed and comprehensively described a data-driven blood glucose model based on a decision tree gradient boosting algorithm to predict different characteristics of postprandial glycemic responses; the model utilized meal-related data derived from a mobile app diary (including information on the glycemic index), food context (information on previous meals), characteristics of the individual patients and patient behavioral questionnaires. A set of rules was defined and implemented to detect incorrect meal records and to filter faulty data, and analyses were conducted on the overall food diary data and in particular, the data on the current meal for which the postprandial blood glucose response was calculated. Different gradient boosting models were trained and evaluated with parameters selected via random search cross-validation. The best models for the prediction of the incremental area under the blood glucose curve two hours after food intake had the following characteristics: R = 0.631, MAE = 0.373 mmol/L*h for the model not using data on current blood glucose; R = 0.644, MAE = 0.371 mmol/L*h for the model using data on the current blood glucose levels; and R = 0.704, MAE = 0.341 mmol/L*h for the model utilizing data on the continuous blood glucose trends before the meal. The impact of features was evaluated using Shapley values. The meal glycemic load, amount of carbohydrates in the meal, type of meal (e.g., breakfast), amount of starch and amount of food consumed 6 hours before the current meal were the most important contributors in the models.
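
The prediction target named above, the incremental area under the blood glucose curve (iAUC) over two hours, and the two reported metrics (R and MAE) can be illustrated with a short sketch. The abstract does not give the exact iAUC definition used in the paper, so the code below assumes a common convention: the baseline is the first reading at meal time, only area above the baseline is counted, and segments are integrated with the trapezoidal rule. The function names `incremental_auc`, `mean_absolute_error` and `pearson_r` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def incremental_auc(times_h, glucose, baseline=None):
    """Area above baseline under a glucose curve, in mmol/L*h.

    Assumption (not from the paper): baseline defaults to the first
    reading, and area below the baseline is ignored.
    """
    if len(times_h) != len(glucose) or len(times_h) < 2:
        raise ValueError("need at least two matching time/glucose samples")
    if baseline is None:
        baseline = glucose[0]
    area = 0.0
    for i in range(len(times_h) - 1):
        dt = times_h[i + 1] - times_h[i]
        d0 = glucose[i] - baseline      # increment at segment start
        d1 = glucose[i + 1] - baseline  # increment at segment end
        if d0 >= 0 and d1 >= 0:
            # Whole segment lies on or above the baseline: full trapezoid.
            area += 0.5 * (d0 + d1) * dt
        elif d0 > 0 > d1:
            # Curve falls through the baseline: keep the leading triangle.
            area += 0.5 * d0 * dt * d0 / (d0 - d1)
        elif d0 < 0 < d1:
            # Curve rises through the baseline: keep the trailing triangle.
            area += 0.5 * d1 * dt * d1 / (d1 - d0)
        # Otherwise the segment is on or below the baseline and adds nothing.
    return area


def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Illustrative readings every 30 minutes for two hours (made-up values).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
readings = [5.0, 6.0, 7.0, 6.0, 5.0]
iauc = incremental_auc(times, readings)
```

For the made-up readings above, the curve stays at or above the baseline throughout, so the result is the plain trapezoidal area of the increments. A model's predicted iAUC values could then be compared with observed values using `mean_absolute_error` and `pearson_r`.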

Keywords