Data in Brief (Dec 2022)

A behaviour biometrics dataset for user identification and authentication

  • Nonso Nnamoko,
  • Joseph Barrowclough,
  • Mark Liptrott,
  • Ioannis Korkontzelos

Journal volume & issue
Vol. 45
p. 108728

Abstract

Read online

As e-Commerce continues to shift our shopping preference from the physical to online marketplace, we leave behind digital traces of our personally identifiable details. For example, the merchant keeps record of your name and address; the payment processor stores your transaction details including account or card information, and every website you visit stores other information such as your device address and type. Cybercriminals constantly steal and use some of this information to commit identity fraud, ultimately leading to devastating consequences to the victims; but also, to the card issuers and payment processors with whom the financial liability most often lies. To this end, we recognise that data is generally compromised in this digital age, and personal data such as card number, password, personal identification number and account details can be easily stolen and used by someone else. However, there is a plethora of data relating to a person's behaviour biometrics that are almost impossible to steal, such as the way they type on a keyboard, move the cursor, or whether they normally do so via a mouse, touchpad or trackball. This data, commonly called keystroke, mouse and touchscreen dynamics, can be used to create a unique profile for the legitimate card owner, that can be utilised as an additional layer of user authentication during online card payments. Machine learning is a powerful technique for analysing such data to gain knowledge; and has been widely used successfully in many sectors for profiling e.g., genome classification in molecular biology and genetics where predictions are made for one or more forms of biochemical activity along the genome. Similar techniques are applicable in the financial sector to detect anomaly in user keyboard and mouse behaviour when entering card details online, such that they can be used to distinguish between a legitimate and an illegitimate card owner. In this article, a behaviour biometrics (i.e., keystroke and mouse dynamics) dataset, collected from 88 individuals, is presented. The dataset holds a total of 1760 instances categorised into two classes (i.e., legitimate and illegitimate card owners’ behaviour). The data was collected to facilitate an academic start-up project (called CyberSignature1) which received funding from Innovate UK, under the Cyber Security Academic Startup Accelerator Programme. The dataset could be helpful to researchers who apply machine learning to develop applications using keystroke and mouse dynamics e.g., in cybersecurity to prevent identity theft. The dataset, entitled ‘Behaviour Biometrics Dataset’, is freely available on the Mendeley Data repository.

Keywords