PLoS ONE (Jan 2014)
Relative accuracy evaluation.
Abstract
The quality of data plays an important role in business analysis and decision making, and data accuracy is an important aspect in data quality. Thus one necessary task for data quality management is to evaluate the accuracy of the data. And in order to solve the problem that the accuracy of the whole data set is low while a useful part may be high, it is also necessary to evaluate the accuracy of the query results, called relative accuracy. However, as far as we know, neither measure nor effective methods for the accuracy evaluation methods are proposed. Motivated by this, for relative accuracy evaluation, we propose a systematic method. We design a relative accuracy evaluation framework for relational databases based on a new metric to measure the accuracy using statistics. We apply the methods to evaluate the precision and recall of basic queries, which show the result's relative accuracy. We also propose the method to handle data update and to improve accuracy evaluation using functional dependencies. Extensive experimental results show the effectiveness and efficiency of our proposed framework and algorithms.