Journal of Medical Internet Research (Jul 2022)
Drug Recommendation System for Diabetes Using a Collaborative Filtering and Clustering Approach: Development and Performance Evaluation
Abstract
Background: Diabetes is a public health problem worldwide. Although diabetes is a chronic and incurable disease, measures can be taken and treatments given to control it and keep the patient stable. Diabetes has been the subject of extensive research, ranging from disease prevention to the use of technologies for its diagnosis and control. Health institutions obtain the information required for the diagnosis of diabetes through various tests, and appropriate treatment is provided according to the diagnosis. These institutions have databases with large volumes of information that can be analyzed and used in applications such as pattern discovery and outcome prediction, which can help health personnel make decisions about treatments or determine the appropriate prescriptions for diabetes management.
Objective: The aim of this study was to develop a drug recommendation system for patients with diabetes, based on collaborative filtering and clustering techniques, as a complement to the treatments given by the treating doctor.
Methods: The data set used contains information from patients with diabetes and is available in the University of California Irvine Machine Learning Repository. Data mining techniques were applied to process and analyze the data set. Unsupervised learning techniques were used for dimensionality reduction and patient clustering. Drug predictions were obtained with a user-based collaborative filtering approach, which builds a patient profile that can be compared with the profiles of other patients with similar characteristics. Finally, recommendations were made considering the identified patient groups. The performance of the system was evaluated using metrics that assess the quality of the groups and the quality of the predictions and recommendations.
Results: Principal component analysis, used to reduce the dimensionality of the data, showed that eight components best explained the variability of the data.
We identified six evenly distributed groups of patients using the clustering algorithm. These groups were identified based on the available information of patients with diabetes, and the variation between groups was then examined to predict a suitable medication for a target patient. The recommender system achieved a mean squared error of 0.51 for the quality of predictions and an accuracy of 0.61 for the quality of recommendations, which is acceptable.
Conclusions: This work presents a recommendation system that suggests medications according to drug information and the characteristics of patients with diabetes. Some aspects related to this disease were analyzed based on the data set of patients with diabetes. The experimental results with clustering and prediction techniques were found to be acceptable for the recommendation process. This system can provide a novel perspective for health institutions that require technologies to support health care personnel in the management of diabetes treatment and control.
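
The prediction step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each patient already has a reduced feature profile and a cluster label (for example, from the dimensionality reduction and clustering steps), that cosine similarity is the similarity measure, and that a drug score is predicted as the similarity-weighted mean of the scores of patients in the same cluster. The function names, patient identifiers, drug names, and all numeric values below are hypothetical toy data, not taken from the study.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_score(target, drug, profiles, clusters, ratings):
    """User-based prediction: similarity-weighted mean of the drug scores of
    patients in the target's cluster. Returns None when no such patient
    has a score for the drug."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, scores in ratings.items():
        if other == target or clusters[other] != clusters[target]:
            continue  # only compare with other patients in the same cluster
        if drug not in scores:
            continue  # this neighbour has no score for the drug
        weight = cosine(profiles[target], profiles[other])
        if weight <= 0:
            continue  # ignore dissimilar neighbours
        weighted_sum += weight * scores[drug]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


def mean_squared_error(pairs):
    """Mean squared error over (predicted, actual) pairs."""
    return sum((p - a) ** 2 for p, a in pairs) / len(pairs)


# Hypothetical toy data: reduced profiles, cluster labels, and drug scores.
profiles = {"p1": [1.0, 0.0], "p2": [1.0, 0.0], "p3": [0.0, 1.0], "p4": [1.0, 1.0]}
clusters = {"p1": 0, "p2": 0, "p3": 1, "p4": 0}
ratings = {
    "p1": {"metformin": 1.0},
    "p2": {"metformin": 1.0, "insulin": 0.0},
    "p3": {"insulin": 1.0},
    "p4": {"insulin": 1.0},
}

prediction = predict_score("p1", "insulin", profiles, clusters, ratings)
print(prediction)
```

In this toy example the prediction for patient `p1` and `insulin` is about 0.41: `p2` (identical profile, score 0) and `p4` (partially similar profile, score 1) both contribute, while `p3` is excluded because it belongs to a different cluster. Restricting neighbours to the target's cluster is one plausible reading of "recommendations were made considering the identified patient groups"; other designs are possible.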