High-Cardinality Categorical Attributes and Credit Card Fraud Detection

Emanuel Mineda Carneiro; Carlos Henrique Quartucci Forster; Lineu Fernando Stege Mialaret; Luiz Alberto Vieira Dias; Adilson Marques da Cunha

doi:10.3390/math10203808

Mathematics (Oct 2022)

Affiliations

Emanuel Mineda Carneiro: Sao Paulo State Technological College (Faculdade de Tecnologia—Fatec), Sao Jose dos Campos 12247-014, Brazil
Carlos Henrique Quartucci Forster: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil
Lineu Fernando Stege Mialaret: Federal Institute of Education, Science and Technology of Sao Paulo (Instituto Federal de Sao Paulo—IFSP), Jacarei 12322-030, Brazil
Luiz Alberto Vieira Dias: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil
Adilson Marques da Cunha: Brazilian Aeronautics Institute of Technology (Instituto Tecnologico de Aeronautica—ITA), Sao Jose dos Campos 12228-900, Brazil

DOI: https://doi.org/10.3390/math10203808
Journal volume & issue: Vol. 10, no. 20, p. 3808

Abstract

Credit card transactions may contain categorical attributes with large domains, involving up to hundreds of possible values; these are known as high-cardinality attributes. Including such attributes makes analysis harder, because models tend to generalize more poorly and consume more resources. A common practice is therefore to remove such attributes, which wastes the information they provide. In contrast, this paper reports our findings on the positive impact of using high-cardinality attributes in credit card fraud detection. We also present a new domain-reduction algorithm that preserves fraud-detection capabilities. Experiments applying a deep feedforward neural network to real datasets from a major Brazilian financial institution show that, as measured by the F1 metric, including such attributes improves fraud-detection quality. As its main contribution, the proposed algorithm reduced attribute cardinality, improving model training times while preserving predictive capabilities.
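
To make the idea of domain reduction concrete, the following is a minimal sketch of one generic approach: grouping infrequent category values into a shared bucket. It is not the algorithm proposed in the paper, whose details are not given in this abstract. The function name reduce_domain, the min_count threshold, the "OTHER" label, and the sample merchant codes are all assumptions made for illustration.

```python
from collections import Counter


def reduce_domain(values, min_count=2, other_label="OTHER"):
    """Replace values seen fewer than min_count times with other_label.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Hypothetical categorical attribute, e.g. a merchant code per transaction.
merchant_codes = ["A", "A", "B", "C", "A", "D", "B"]
reduced = reduce_domain(merchant_codes)

# Values "C" and "D" each occur once, so both fall into the shared bucket
# and the attribute's cardinality drops from 4 to 3.
print(reduced)
print(len(set(merchant_codes)), "->", len(set(reduced)))
```

A frequency threshold like this ignores the class label entirely; a reduction aimed at preserving fraud-detection capability would presumably also need to consider how each value relates to fraud.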

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
