Machine Learning and Knowledge Extraction (Mar 2024)

Classification, Regression, and Survival Rule Induction with Complex and M-of-N Elementary Conditions

  • Cezary Maszczyk,
  • Marek Sikora,
  • Łukasz Wróbel

DOI
https://doi.org/10.3390/make6010026
Journal volume & issue
Vol. 6, no. 1
pp. 554 – 579

Abstract

Most rule induction algorithms generate rules with simple logical conditions based on equality or inequality relations. This feature limits their ability to discover complex dependencies that may exist in data. This article presents an extension to the sequential covering rule induction algorithm that allows it to generate complex and M-of-N conditions within the premises of rules. The proposed methodology uncovers complex patterns in data that are not adequately expressed by rules with simple conditions. The novel two-phase approach efficiently generates M-of-N conditions by analysing frequent sets in previously induced simple and complex rule conditions. The presented method allows rule induction for classification, regression and survival problems. Extensive experiments on various public datasets show that the proposed method often leads to more concise rulesets compared to those using only simple conditions. Importantly, the inclusion of complex conditions and M-of-N conditions has no statistically significant negative impact on the predictive ability of the ruleset. Experimental results and a ready-to-use implementation are available in the GitHub repository. The proposed algorithm can potentially serve as a valuable tool for knowledge discovery and facilitate the interpretation of rule-based models by making them more concise.
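To make the notion of an M-of-N condition concrete, here is a minimal Python sketch. It assumes the usual reading that such a condition holds when at least M of its N elementary conditions are satisfied. The class name `MOfN`, the attribute names and the thresholds are hypothetical illustrations and are not taken from the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

Example = Mapping[str, float]
Condition = Callable[[Example], bool]


@dataclass(frozen=True)
class MOfN:
    """Hypothetical M-of-N condition: true when at least m conditions hold."""
    m: int
    conditions: Sequence[Condition]

    def covers(self, example: Example) -> bool:
        # Count the elementary conditions satisfied by the example.
        satisfied = sum(1 for cond in self.conditions if cond(example))
        return satisfied >= self.m


# Three simple elementary conditions on made-up attributes.
elementary = [
    lambda e: e["age"] > 60,
    lambda e: e["bmi"] >= 30,
    lambda e: e["glucose"] > 125,
]

# "At least 2 of 3" replaces three separate two-condition conjunctions.
premise = MOfN(m=2, conditions=elementary)

result_a = premise.covers({"age": 65, "bmi": 31, "glucose": 100})  # two hold
result_b = premise.covers({"age": 40, "bmi": 31, "glucose": 100})  # one holds
```

A single 2-of-3 premise of this kind covers the same examples as the disjunction of three two-condition conjunctions, which is one way such conditions can shorten a ruleset.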

Keywords