IEEE Access (Jan 2021)
A Mass-Based Approach for Local Outlier Detection
Abstract
This paper proposes a new outlier detection approach that assigns each instance in a given dataset a score measuring its degree of outlierness. The proposed model uses a mass-based dissimilarity measure to address the weaknesses of neighbor-based outlier models when detecting local outliers in datasets whose regions vary in density. It first applies a hierarchical partitioning technique to generate a set of tree-like, nested partitions of the input dataset, and then defines a mass-based dissimilarity measure that quantifies the dissimilarity between two data instances under the generated partition structure. Next, for each data instance, a context set is formed from the k neighbors with the lowest mass-based dissimilarity to it, and a mass-based local outlier score computed from these context sets gives the outlierness of each individual instance. The approach thus replaces the distance-based functions used in most neighbor-based methods with a mass-based measure, which fundamentally changes the perspective of the outlier model. Comprehensive experiments on both synthetic and real-world datasets demonstrate that the proposed approach is competitive with existing state-of-the-art outlier detection models and is an efficient and effective alternative for local outlier detection.
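The pipeline described in the abstract (hierarchical partitioning, mass-based dissimilarity, context sets, local score) can be illustrated with a minimal Python sketch. Everything specific here is an assumption, since the abstract does not give the details: the partitions are random axis-parallel trees built on the full dataset, the dissimilarity of two points is the average relative size of the smallest tree node containing both, and the score is a simple ratio of a point's mean dissimilarity to its context set against the same quantity for its context members. The function names `build_tree`, `add_masses`, `mass_dissimilarity`, and `local_outlier_scores` and all parameters are hypothetical, and the score is a stand-in rather than the paper's own formula.

```python
import random


def build_tree(points, idx, depth, max_depth, rng):
    """Split the index list `idx` recursively with random axis-parallel cuts."""
    node = {"idx": idx, "size": len(idx), "children": None}
    if len(idx) <= 1 or depth >= max_depth:
        return node
    dim = rng.randrange(len(points[0]))
    lo = min(points[i][dim] for i in idx)
    hi = max(points[i][dim] for i in idx)
    if lo == hi:
        return node  # no spread on this dimension; keep the node as a leaf
    cut = rng.uniform(lo, hi)
    left = [i for i in idx if points[i][dim] < cut]
    right = [i for i in idx if points[i][dim] >= cut]
    if not left or not right:
        return node
    node["children"] = (
        build_tree(points, left, depth + 1, max_depth, rng),
        build_tree(points, right, depth + 1, max_depth, rng),
    )
    return node


def add_masses(node, mass):
    """Add, for every pair, the size of the smallest node covering both points."""
    if node["children"] is None:
        for i in node["idx"]:
            for j in node["idx"]:
                mass[i][j] += node["size"]
        return
    left, right = node["children"]
    # Pairs separated at this node have this node as their smallest common region.
    for i in left["idx"]:
        for j in right["idx"]:
            mass[i][j] += node["size"]
            mass[j][i] += node["size"]
    add_masses(left, mass)
    add_masses(right, mass)


def mass_dissimilarity(points, n_trees=50, max_depth=8, seed=0):
    """Average relative mass of the smallest shared region over all trees."""
    rng = random.Random(seed)
    n = len(points)
    mass = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        tree = build_tree(points, list(range(n)), 0, max_depth, rng)
        add_masses(tree, mass)
    return [[m / (n_trees * n) for m in row] for row in mass]


def local_outlier_scores(dissim, k=5):
    """Assumed score: own mean context dissimilarity over that of the context."""
    n = len(dissim)
    context, spread = [], []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: dissim[i][j])
        ctx = others[:k]  # the k instances with the lowest mass dissimilarity
        context.append(ctx)
        spread.append(sum(dissim[i][j] for j in ctx) / len(ctx))
    scores = []
    for i in range(n):
        ctx_spread = sum(spread[j] for j in context[i]) / len(context[i])
        scores.append(spread[i] / ctx_spread)
    return scores
```

Under these assumptions, scores near 1 indicate an instance that sits in its context much as its neighbors sit in theirs, while larger values indicate a candidate local outlier. The pairwise accumulation costs quadratic time per tree, so this sketch is only meant for small datasets.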
Keywords