IEEE Access (Jan 2020)

Exceptional in so Many Ways—Discovering Descriptors That Display Exceptional Behavior on Contrasting Scenarios

  • Jose Maria Luna,
  • Mykola Pechenizkiy,
  • Wouter Duivesteijn,
  • Sebastian Ventura

DOI
https://doi.org/10.1109/ACCESS.2020.3034885
Journal volume & issue
Vol. 8
pp. 200982 – 200994

Abstract

The current state of the art in supervised descriptive pattern mining is very good at automatically finding subsets of the dataset at hand that are exceptional in some sense. The most common form, subgroup discovery, generally finds subgroups where a single target variable has an unusual distribution. Exceptional model mining (EMM) typically finds subgroups where a pair of target variables displays an unusual interaction. What these methods have in common is that one specific exceptionality is enough to flag up a subgroup as exceptional. This, however, naturally leads to the question: can we also find multiple instances of exceptional behaviour simultaneously in the same subgroup? This paper provides a first, affirmative answer to that question in the form of the SPEC (Subsets of Pairwise Exceptional Correlations) model class for EMM. Given a set of predefined numeric target variables, SPEC will flag up subgroups as interesting if multiple target pairs display an unusual rank correlation. This is a fundamental extension of the EMM toolbox, which comes with additional algorithmic challenges. To address these challenges, we provide a series of algorithmic solutions whose strengths and flaws are empirically analysed.
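
The idea of flagging a subgroup when several target pairs show an unusual rank correlation can be illustrated with a minimal Python sketch. It is not the SPEC quality measure or any of the algorithms from the paper, which the abstract does not specify. The function names, the comparison of the subgroup against its complement, the use of Spearman's rank correlation, and the `min_gap` threshold are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: treat a constant variable as uncorrelated.
    return cov / (var_x * var_y) ** 0.5


def exceptional_pairs(rows, targets, in_subgroup, min_gap=0.5):
    """List target pairs whose rank correlation inside the subgroup differs
    from the correlation in the complement by at least min_gap.

    min_gap is a hypothetical threshold chosen for this sketch.
    """
    inside = [r for r in rows if in_subgroup(r)]
    outside = [r for r in rows if not in_subgroup(r)]
    flagged = []
    for a, b in combinations(targets, 2):
        rho_in = spearman([r[a] for r in inside], [r[b] for r in inside])
        rho_out = spearman([r[a] for r in outside], [r[b] for r in outside])
        if abs(rho_in - rho_out) >= min_gap:
            flagged.append((a, b))
    return flagged


# Toy data: "g" is a descriptor, "x", "y" and "z" are numeric targets.
rows = [
    {"g": "a", "x": 1, "y": 4, "z": 10},
    {"g": "a", "x": 2, "y": 3, "z": 20},
    {"g": "a", "x": 3, "y": 2, "z": 30},
    {"g": "a", "x": 4, "y": 1, "z": 40},
    {"g": "b", "x": 1, "y": 1, "z": 10},
    {"g": "b", "x": 2, "y": 2, "z": 20},
    {"g": "b", "x": 3, "y": 3, "z": 30},
    {"g": "b", "x": 4, "y": 4, "z": 40},
]
flagged = exceptional_pairs(rows, ["x", "y", "z"], lambda r: r["g"] == "a")
print(flagged)
```

In this toy dataset the subgroup `g == "a"` reverses the ordering of `y` relative to both `x` and `z`, so two of the three target pairs are flagged at once. That is the kind of multi-pair exceptionality the abstract describes, here in a deliberately simplified form.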

Keywords