Entropy (Feb 2022)

Missing Value Imputation Method for Multiclass Matrix Data Based on Closed Itemset

  • Mayu Tada,
  • Natsumi Suzuki,
  • Yoshifumi Okada

DOI
https://doi.org/10.3390/e24020286
Journal volume & issue
Vol. 24, no. 2
p. 286

Abstract

Handling missing values in matrix data is an important step in data analysis. To date, many methods have been proposed that estimate missing values based on data pattern similarity. Most of these methods perform imputation based on data trends over the entire feature space. However, individual missing values are likely to show similarity to data patterns in a local feature space. In addition, most existing methods focus on single-class data, while multiclass analysis is frequently required in various fields. Missing value imputation for multiclass data must consider the characteristics of each class. In this paper, we propose two methods based on closed itemsets, CIimpute and ICIimpute, to achieve missing value imputation using local feature space for multiclass matrix data. CIimpute estimates missing values using closed itemsets extracted from each class. ICIimpute is an improved version of CIimpute that introduces an attribute reduction process. Experimental results demonstrate that attribute reduction considerably reduces computational time and improves imputation accuracy. Furthermore, compared to existing methods, ICIimpute provides superior imputation accuracy but requires more computational time.
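The idea of imputing from closed itemsets extracted per class can be illustrated with a small sketch. This is not the published CIimpute or ICIimpute algorithm: the abstract does not specify the discretization, the mining procedure, or the matching rule, so everything below is an assumption made for illustration. The sketch assumes the matrix has already been discretized into symbolic values, enumerates closed itemsets by brute force (intersections of row subsets, feasible only for tiny data), and fills a missing cell from the closed itemset of the same class that agrees with the most observed values in that row. The function names `closed_itemsets` and `impute`, the `min_support` parameter, and the class-mode fallback are all hypothetical. The attribute reduction step of ICIimpute is not sketched.

```python
from collections import Counter
from itertools import combinations


def closed_itemsets(rows, min_support=2):
    """Brute-force closed itemsets: the intersection of any group of rows is closed.

    Items are (attribute, value) pairs; missing cells (None) are skipped.
    Returns a dict mapping each closed itemset to its support.
    """
    transactions = [
        frozenset((a, v) for a, v in row.items() if v is not None) for row in rows
    ]
    closed = {}
    for k in range(min_support, len(transactions) + 1):
        for group in combinations(transactions, k):
            common = frozenset.intersection(*group)
            if common:
                closed[common] = sum(1 for t in transactions if common <= t)
    return closed


def impute(rows, labels, min_support=2):
    """Fill missing cells using closed itemsets mined separately for each class."""
    completed = [dict(r) for r in rows]  # leave the input untouched
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        class_rows = [rows[i] for i in idx]
        itemsets = closed_itemsets(class_rows, min_support)
        for i in idx:
            observed = {(a, v) for a, v in rows[i].items() if v is not None}
            for attr, val in rows[i].items():
                if val is not None:
                    continue
                best = None  # ((matched items, support), candidate value)
                for items, support in itemsets.items():
                    candidate = [v for a, v in items if a == attr]
                    rest = {(a, v) for a, v in items if a != attr}
                    # The itemset must cover the missing attribute and agree
                    # with the row on all of its remaining items.
                    if not candidate or not rest or not rest <= observed:
                        continue
                    score = (len(rest), support)
                    if best is None or score > best[0]:
                        best = (score, candidate[0])
                if best is not None:
                    completed[i][attr] = best[1]
                else:
                    # Assumed fallback: most frequent observed value in the class.
                    counts = Counter(
                        r[attr] for r in class_rows if r[attr] is not None
                    )
                    if counts:
                        completed[i][attr] = counts.most_common(1)[0][0]
    return completed


# Toy data: f1 and f2 are identical across classes, so only the class
# membership distinguishes the value of f3.
rows = [
    {"f1": "high", "f2": "low", "f3": "high"},
    {"f1": "high", "f2": "low", "f3": "high"},
    {"f1": "high", "f2": "low", "f3": None},
    {"f1": "high", "f2": "low", "f3": "low"},
    {"f1": "high", "f2": "low", "f3": "low"},
    {"f1": "high", "f2": None, "f3": "low"},
]
labels = ["A", "A", "A", "B", "B", "B"]
completed = impute(rows, labels)
```

In this toy data the third row receives `f3 = "high"` because the only closed itemset in class A that covers `f3` carries that value, whereas a method that pooled both classes would see `high` and `low` equally often for the same `f1`, `f2` pattern. The brute-force enumeration is exponential in the number of rows; a real implementation would use a dedicated closed itemset miner.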

Keywords