International Journal of General Medicine (Oct 2020)
Predicting Postoperative Length of Stay for Isolated Coronary Artery Bypass Graft Patients Using Machine Learning
Abstract
Fatima Alshakhs,1 Hana Alharthi,1 Nida Aslam,2 Irfan Ullah Khan,2 Mohamed Elasheri3 1Department of Health Information Management & Technology, College of Public Health, Imam Abdulrahman Bin Faisal University, Dammam 34221-4237, Saudi Arabia; 2Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 34221-4237, Saudi Arabia; 3Department of Cardiac Surgery, Saud Albabtain Cardiac Centre, Dammam 32245, Saudi ArabiaCorrespondence: Fatima AlshakhsImam Abdulrahman Bin Faisal University, College of Public Health, Health Information Management and Technology (HIMT), Dammam 34212, Saudi ArabiaTel +966 13 333 1309Email [email protected]: Predictive analytics (PA) is a new trending approach in the field of healthcare that uses machine learning to build a prediction model using supervised learning algorithms. Isolated coronary artery bypass grafting (iCABG), an open-heart surgery, is commonly performed in the treatment of coronary heart disease.Aim: The aim of this study was to develop and evaluate a model to predict postoperative length of stay (PLoS) for iCABG patients using supervised machine learning techniques, and to identify the features with the highest contribution to the model.Methods: This is a retrospective study that uses historic data of adult patients who underwent isolated CABG (iCABG). After initial data pre-processing, data imputation using the kNN method was applied. The study used five prediction models using Naïve Bayes, Decision Tree, Random Forest, Logistic Regression and k Nearest Neighbor algorithms. Data imbalance was managed using the following widely used methods: oversampling, undersampling, “Both”, and random over-sampling examples (ROSE). The features selection process was conducted using the Boruta method. Two techniques were applied to examine the performance of the models, (70%, 30%) split and cross-validation, respectively. Models were evaluated by comparing their performance using AUC and other metrics.Results: In the final dataset, six distinct features and 621 instances were used to develop the models. A total of 20 models were developed using R statistical software. The model generated using Random Forest with “Both” resampling method and cross-validation technique was deemed the best fit (AUC=0.81; F1 score=0.82; and recall=0.82). Attributes found to be highly predictive of PLoS were pulmonary artery systolic, age, height, EuroScore II, intra-aortic balloon pump used, and complications during operation.Conclusion: This study demonstrates the significance and effectiveness of building a model that predicts PLoS for iCABG patients using patient specifications and pre-/intra-operative measures.Keywords: predictive analytics, classifiers, CABG, LoS