A synthetic data set to benchmark anti-money laundering methods

Rasmus Ingemann Tuffveson Jensen; Joras Ferwerda; Kristian Sand Jørgensen; Erik Rathje Jensen; Martin Borg; Morten Persson Krogh; Jonas Brunholm Jensen; Alexandros Iosifidis

doi:10.1038/s41597-023-02569-2

Scientific Data (Sep 2023)

Affiliations

Rasmus Ingemann Tuffveson Jensen: Department of Electrical and Computer Engineering, Aarhus University
Joras Ferwerda: School of Economics, Utrecht University
Kristian Sand Jørgensen: Spar Nord Bank
Erik Rathje Jensen: Spar Nord Bank
Martin Borg: Spar Nord Bank
Morten Persson Krogh: Spar Nord Bank
Jonas Brunholm Jensen: Spar Nord Bank
Alexandros Iosifidis: Department of Electrical and Computer Engineering, Aarhus University

Journal volume & issue: Vol. 10, no. 1, pp. 1–10

Abstract

Bank transactions are highly confidential. As a result, there are no real public data sets that can be used to investigate and compare anti-money laundering (AML) methods in banks. This severely limits research on important AML problems such as efficiency, effectiveness, class imbalance, concept drift, and interpretability. To address the issue, we present SynthAML: a synthetic data set to benchmark statistical and machine learning methods for AML. The data set builds on real data from Spar Nord, a systemically important Danish bank, and contains 20,000 AML alerts and over 16 million transactions. Experimental results indicate that performance on SynthAML can be transferred to the real world. As use cases, we present and discuss open problems in the AML literature.

Published in Scientific Data

ISSN: 2052-4463 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/sdata/
