Jisuanji kexue (Aug 2022)

RIIM: Real-Time Imputation Based on Individual Models

  • LI Xia, MA Qian, BAI Mei, WANG Xi-te, LI Guan-yu, NING Bo

DOI
https://doi.org/10.11896/jsjkx.210600180
Journal volume & issue
Vol. 49, no. 8
pp. 56 – 63

Abstract

As data sources multiply, data become easy to obtain but are often of low quality, so missing values (MVs) are ubiquitous and hard to avoid. Consequently, MV imputation has become one of the classical problems in data quality management. However, most existing MV imputation approaches are designed for static data and cannot handle dynamic data streams arriving at high speed. Moreover, they do not consider data sparsity and heterogeneity simultaneously. Therefore, a novel MV imputation approach, real-time imputation based on individual models (RIIM), is proposed. RIIM fills MVs effectively by combining the basic ideas of neighbor-based imputation and regression-based imputation while taking the sparsity and heterogeneity of the data into account. To cope with the dynamic, real-time nature of data streams, the MV imputation model is updated incrementally. Moreover, an adaptive and periodic updating strategy for the optimal-neighbor search is proposed to address the high time cost of the search and the difficulty of determining the number of neighbors. Finally, the effectiveness of RIIM is evaluated through extensive experiments on real-world datasets.
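
The general idea of combining neighbor-based and regression-based imputation can be illustrated with a minimal Python sketch. It is not the RIIM algorithm itself: the class name `IndividualModelImputer`, the helpers `fit_line` and `nearest`, the single-feature setting, and the fixed `k` are all assumptions made for illustration. Each complete point gets its own local regression model fitted over its nearest neighbors, and a missing value is filled by averaging the predictions of the models that belong to the query's nearest neighbors. The sketch refits every model on each arrival, which is a naive stand-in for the incremental update and adaptive neighbor search described in the abstract.

```python
from statistics import mean


def fit_line(points):
    """Least-squares fit y = a + b*x; constant model if x has no spread."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b


def nearest(points, x, k):
    """Return the k points whose x-value is closest to x."""
    return sorted(points, key=lambda p: abs(p[0] - x))[:k]


class IndividualModelImputer:
    """Hypothetical sketch: one local regression model per complete point."""

    def __init__(self, k=3):
        self.k = k
        self.points = []   # complete (x, y) observations seen so far
        self.models = []   # models[i] = (a, b) fitted around points[i]

    def add(self, x, y):
        """Ingest a complete observation from the stream.

        Assumption: all models are refitted here for simplicity; a real
        streaming method would update only what the new point affects.
        """
        self.points.append((x, y))
        self.models = [
            fit_line(nearest(self.points, px, self.k))
            for px, _ in self.points
        ]

    def impute(self, x):
        """Estimate the missing y for x from the neighbors' own models."""
        if not self.points:
            raise ValueError("no complete observations available")
        order = sorted(
            range(len(self.points)),
            key=lambda i: abs(self.points[i][0] - x),
        )[: self.k]
        return mean(self.models[i][0] + self.models[i][1] * x for i in order)


# Usage: complete points follow y = 2x + 1; the y-value at x = 2.5 is missing.
imputer = IndividualModelImputer(k=3)
for x in range(6):
    imputer.add(x, 2 * x + 1)
print(imputer.impute(2.5))
```

Because every neighbor contributes a model learned from its own neighborhood, regions of the data that follow different trends are not forced into one global regression, which is the intuition behind handling heterogeneity in this family of methods.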

Keywords