Journal of Translational Medicine (Sep 2019)

Discrimination of indoor versus outdoor environmental state with machine learning algorithms in myopia observational studies

  • Bin Ye,
  • Kangping Liu,
  • Siting Cao,
  • Padmaja Sankaridurg,
  • Wayne Li,
  • Mengli Luan,
  • Bo Zhang,
  • Jianfeng Zhu,
  • Haidong Zou,
  • Xun Xu,
  • Xiangui He

DOI
https://doi.org/10.1186/s12967-019-2057-2
Journal volume & issue
Vol. 17, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Background Wearable smart watches provide large amount of real-time data on the environmental state of the users and are useful to determine risk factors for onset and progression of myopia. We aim to evaluate the efficacy of machine learning algorithm in differentiating indoor and outdoor locations as collected by use of smart watches. Methods Real time data on luminance, ultraviolet light levels and number of steps obtained with smart watches from dataset A: 12 adults from 8 scenes and manually recorded true locations. 70% of data was considered training set and support vector machine (SVM) algorithm generated using the variables to create a classification system. Data collected manually by the adults was the reference. The algorithm was used for predicting the location of the remaining 30% of dataset A. Accuracy was defined as the number of correct predictions divided by all. Similarly, data was corrected from dataset B: 172 children from 3 schools and 12 supervisors recorded true locations. Data collected by the supervisors was the reference. SVM model trained from dataset A was used to predict the location of dataset B for validation. Finally, we predicted the location of dataset B using the SVM model self-trained from dataset B. We repeated these three predictions with traditional univariate threshold segmentation method. Results In both datasets, SVM outperformed the univariate threshold segmentation method. In dataset A, the accuracy and AUC of SVM were 99.55% and 0.99 as compared to 95.11% and 0.95 with the univariate threshold segmentation (p < 0.01). In validation, the accuracy and AUC of SVM were 82.67% and 0.90 compared to 80.88% and 0.85 with the univariate threshold segmentation method (p < 0.01). In dataset B, the accuracy and AUC of SVM and AUC were 92.43% and 0.96 compared to 80.88% and 0.85 with the univariate threshold segmentation (p < 0.01). Conclusions Machine learning algorithm allows for discrimination of outdoor versus indoor environments with high accuracy and provides an opportunity to study and determine the role of environmental risk factors in onset and progression of myopia. The accuracy of machine learning algorithm could be improved if the model is trained with the dataset itself.

Keywords