IEEE Access (Jan 2020)
A Predictive Performance Analysis of Vitamin D Deficiency Severity Using Machine Learning Methods
Abstract
Vitamin D Deficiency (VDD) is one of the most significant global health problems, and there is strong demand for predicting its severity using non-invasive methods. The primary data, containing serum Vitamin D levels, were collected from a total of 3044 college students between 18 and 21 years of age. Independent parameters such as age, sex, weight, height, body mass index (BMI), waist circumference, body fat, bone mass, exercise, sunlight exposure, and milk consumption were used to predict VDD. The study aims to compare and evaluate different machine learning models for predicting the severity of VDD. The objectives of our approach are to apply a range of machine learning algorithms to the prediction task and to evaluate the results with performance measures including Precision, Recall, F1-measure, Accuracy, and the area under the receiver operating characteristic (ROC) curve. McNemar's test, a statistical test, was conducted to validate the empirical results. The final objective is to identify the best machine learning classifier for predicting the severity of VDD. Ten popular machine learning classifiers, namely K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), AdaBoost (AB), Bagging Classifier (BC), ExtraTrees (ET), Stochastic Gradient Descent (SGD), Gradient Boosting (GB), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP), were implemented to predict the severity of VDD. The final experimental results showed that the Random Forest classifier achieved the highest accuracy, 96%, and performed well on both the training and testing portions of the Vitamin D dataset. The results of McNemar's test support the conclusion that the RF classifier outperforms the other classifiers.
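As an illustration of how McNemar's test can compare two classifiers evaluated on the same test set, the following is a minimal sketch and not the authors' implementation. The function name `mcnemar_test`, the integer class labels, and the toy predictions are all hypothetical, and the continuity-corrected form of the statistic is an assumption, since the abstract does not state which variant was used.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers on one test set.

    Returns (b, c, statistic, p_value), where
      b = samples classifier A got right and classifier B got wrong,
      c = samples classifier B got right and classifier A got wrong.
    """
    b = c = 0
    for truth, a, p in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (p == truth)
        if a_ok and not b_ok:
            b += 1
        elif b_ok and not a_ok:
            c += 1

    if b + c == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return b, c, 0.0, 1.0

    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return b, c, statistic, p_value


# Toy data (hypothetical): integer labels standing in for severity classes.
y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
pred_a = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]  # classifier A: all correct
pred_b = [1, 1, 0, 0, 2, 2, 1, 0, 2, 1]  # classifier B: six errors

b, c, statistic, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"b={b}, c={c}, statistic={statistic:.3f}, p={p_value:.3f}")
```

A small p-value suggests that the two classifiers' error rates differ by more than chance would explain. With few disagreements (small b + c), an exact binomial version of the test is usually preferred over this chi-square approximation.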
Keywords