Mathematics (Oct 2022)

High-Cardinality Categorical Attributes and Credit Card Fraud Detection

  • Emanuel Mineda Carneiro,
  • Carlos Henrique Quartucci Forster,
  • Lineu Fernando Stege Mialaret,
  • Luiz Alberto Vieira Dias,
  • Adilson Marques da Cunha

DOI
https://doi.org/10.3390/math10203808
Journal volume & issue
Vol. 10, no. 20
p. 3808

Abstract

Credit card transactions may contain categorical attributes with large domains, involving up to hundreds of possible values, known as high-cardinality attributes. Including such attributes makes analysis harder, because models tend to generalize more poorly and consume more resources. A common practice is therefore to remove these attributes, at the cost of discarding the information they carry. In contrast, this paper reports findings on the positive impact of using high-cardinality attributes in credit card fraud detection, and presents a new domain-reduction algorithm that preserves fraud-detection capability. Experiments applying a deep feedforward neural network to real datasets from a major Brazilian financial institution show that, as measured by the F1 score, including such attributes improves fraud-detection quality. As its main contribution, the proposed algorithm reduced attribute cardinality, improving model training times while preserving predictive capability.
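
Because the abstract reports results in terms of the F1 metric, a minimal sketch of how that score is computed for a binary fraud label may help. The function names and the toy labels below are illustrative assumptions and are not taken from the paper; the formula itself is the standard harmonic mean of precision and recall.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives); 1 marks fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, for illustration only): 1 = fraud, 0 = legitimate.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]

tp, fp, fn = confusion_counts(y_true, y_pred)
score = f1_score(tp, fp, fn)
```

F1 is often preferred over accuracy for fraud data because fraudulent transactions are typically rare, and true negatives do not enter the formula.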

Keywords