IEEE Access (Jan 2023)

FV-ViT: Vision Transformer for Finger Vein Recognition

  • Xiaoye Li,
  • Bin-Bin Zhang

DOI
https://doi.org/10.1109/ACCESS.2023.3297212
Journal volume & issue
Vol. 11
pp. 75451 – 75461

Abstract


Vision Transformer (ViT) has drawn the attention of many researchers in computer vision due to its superior performance in many computer vision tasks. However, there is limited research on ViT models in finger vein recognition. This may be because the excellent performance of ViT models relies on abundant training data, whereas finger vein databases are typically small. In this study, we address this problem and propose a model for finger vein recognition, referred to as FV-ViT. By adding rigorous regularization only in the MLP head (termed regMLP), without changing the architecture of the ViT backbone, the proposed FV-ViT shows outstanding performance compared with other state-of-the-art works: 0.042% EER on FV-USM and 1.033% EER on SDUMLA-HMT. In addition, we compare the baseline FV-ViT model trained from scratch with the corresponding ViT model initialized with pretrained weights: on FV-USM, 0.068% EER for the non-pretrained FV-ViT base versus 0.116% EER for the pretrained model; on SDUMLA-HMT, 1.258% EER for the non-pretrained FV-ViT base versus 1.022% EER for the pretrained model. This indicates that ViT models can be trained from scratch on finger vein databases and achieve performance comparable to pretrained models.
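The abstract describes the approach only at a high level: a standard ViT backbone trained from scratch, with regularization confined to the MLP classification head ("regMLP"). The sketch below (Python/PyTorch) illustrates one way such a setup could look; the class name FVViTSketch, the use of vit_b_16, and the specific regularizers (LayerNorm and dropout in the head) are illustrative assumptions, not the paper's actual implementation, since the abstract does not specify the exact regularization.

import torch
import torch.nn as nn
from torchvision.models import vit_b_16


class FVViTSketch(nn.Module):
    """Hypothetical sketch: ViT backbone + regularized MLP head ("regMLP")."""

    def __init__(self, num_classes: int, dropout: float = 0.5):
        super().__init__()
        # ViT backbone trained from scratch (weights=None); the paper reports
        # that training from scratch on finger vein data is competitive with
        # pretrained initialization.
        self.backbone = vit_b_16(weights=None)
        hidden = self.backbone.heads.head.in_features
        # Assumed "regMLP" head: regularization is applied only in the head,
        # leaving the ViT backbone architecture unchanged.
        self.backbone.heads = nn.Sequential(
            nn.LayerNorm(hidden),
            nn.Dropout(dropout),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)


if __name__ == "__main__":
    # Example: FV-USM contains 123 subjects x 4 fingers = 492 finger classes.
    model = FVViTSketch(num_classes=492)
    dummy = torch.randn(2, 3, 224, 224)  # finger vein images resized to 224x224
    print(model(dummy).shape)  # torch.Size([2, 492])

In this sketch the only change relative to a stock ViT is the replacement of the classification head, which mirrors the abstract's claim that FV-ViT modifies the MLP head rather than the backbone.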

Keywords