Journal of King Saud University: Computer and Information Sciences (Jul 2022)
ILA4: Overcoming missing values in machine learning datasets – An inductive learning approach
Abstract
This article introduces ILA4, a new algorithm designed to handle datasets with missing values. ILA4 is inspired by the earlier series of ILA algorithms and adds further enhancements for handling missing data. ILA4 is applied to datasets of varying completeness and compared with other known approaches for handling missing values. In the majority of cases, ILA4 performed favorably, on a par with many established approaches for treating missing values, including algorithms based on the Most Common Value (MCV), the Most Common Value Restricted to a Concept (MCVRC), and the Delete strategy. ILA4 was also compared with three well-known algorithms, namely Logistic Regression, Naïve Bayes, and Random Forest; the accuracy obtained by ILA4 is comparable to or better than the best results obtained from these three algorithms.
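The three baseline strategies named in the abstract can be illustrated with a minimal Python sketch. It is not ILA4 itself, and the abstract does not define the baselines; the code assumes their usual meaning in the missing-value literature: Delete drops incomplete rows, MCV fills a gap with the attribute's most frequent value over the whole dataset, and MCVRC fills it with the most frequent value among rows of the same class. The `MISSING` marker, the function names, the row layout (class label last), and the toy data are all hypothetical.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def most_common(values):
    """Most frequent non-missing value, or MISSING if none is present."""
    present = [v for v in values if v is not MISSING]
    return Counter(present).most_common(1)[0][0] if present else MISSING


def fill(row, fillers):
    """Replace missing attributes in one row; the class label (last item) is kept."""
    attrs = tuple(f if v is MISSING else v for v, f in zip(row[:-1], fillers))
    return attrs + (row[-1],)


def delete_strategy(rows):
    """Delete: drop every row with at least one missing attribute value."""
    return [r for r in rows if MISSING not in r[:-1]]


def mcv(rows):
    """MCV: fill with the attribute's most common value over all rows."""
    n_attrs = len(rows[0]) - 1
    fillers = [most_common(r[j] for r in rows) for j in range(n_attrs)]
    return [fill(r, fillers) for r in rows]


def mcvrc(rows):
    """MCVRC: fill with the most common value among rows of the same class."""
    n_attrs = len(rows[0]) - 1
    groups = {}
    for r in rows:
        groups.setdefault(r[-1], []).append(r)
    fillers = {
        label: [most_common(r[j] for r in group) for j in range(n_attrs)]
        for label, group in groups.items()
    }
    return [fill(r, fillers[r[-1]]) for r in rows]


# Toy data (hypothetical): (outlook, temperature, class label)
rows = [
    ("sunny", "hot", "no"),
    ("sunny", MISSING, "no"),
    ("sunny", "hot", "no"),
    ("rain", "mild", "yes"),
    (MISSING, "mild", "yes"),
    ("rain", "hot", "yes"),
]

complete_rows = delete_strategy(rows)
mcv_rows = mcv(rows)
mcvrc_rows = mcvrc(rows)
```

On this toy data the fifth row shows how the two imputation strategies differ: MCV fills the missing outlook with the globally most frequent value ("sunny"), while MCVRC restricts the count to rows of class "yes" and fills it with "rain".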