Fish-TViT: A novel fish species classification method in multi water areas based on transfer learning and vision transformer
Bo Gong, Kanyuan Dai, Ji Shao, Ling Jing, Yingyi Chen
Affiliations
Bo Gong
College of Information and Electrical Engineering, China Agricultural University, 100083, Beijing, China; National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, Beijing, 100083, China
Kanyuan Dai
Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, Beijing, 100083, China; Beijing Engineering and Technology Research Center for Internet of Things in Agriculture, China Agricultural University, Beijing, 100083, China; College of Science, China Agricultural University, Beijing, 100083, China
Ji Shao
College of Information and Electrical Engineering, China Agricultural University, 100083, Beijing, China; National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China; Beijing Engineering and Technology Research Center for Internet of Things in Agriculture, China Agricultural University, Beijing, 100083, China
Ling Jing
College of Information and Electrical Engineering, China Agricultural University, 100083, Beijing, China; National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, Beijing, 100083, China; Beijing Engineering and Technology Research Center for Internet of Things in Agriculture, China Agricultural University, Beijing, 100083, China; College of Science, China Agricultural University, Beijing, 100083, China; Corresponding authors at: National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China.
Yingyi Chen
College of Information and Electrical Engineering, China Agricultural University, 100083, Beijing, China; National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, Beijing, 100083, China; Beijing Engineering and Technology Research Center for Internet of Things in Agriculture, China Agricultural University, Beijing, 100083, China; Corresponding authors at: National Innovation Center for Digital Fishery, China Agricultural University, Beijing, 100083, China.
Fish species classification has practical significance for both the aquaculture industry and the general public. However, existing methods for classifying marine and freshwater fishes extract features poorly and do not meet practical needs. To address this issue, we propose Fish-TViT, a novel method for fish classification across multiple water areas based on transfer learning and the vision transformer. Fish-TViT uses a label smoothing loss function to mitigate overfitting and overconfidence in the classifier. We also employ Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the features learned by the model and the image regions on which its decisions depend, which guides the optimization of the model architecture. We first crop and clean the fish images, and then use data augmentation to expand the training dataset. Each image is split into a sequence of flattened patches, from which a pre-trained vision transformer extracts enhanced features. Finally, a multi-layer perceptron predicts the fish species. Experimental results show that Fish-TViT achieves high classification accuracy on both low-resolution marine fish data (94.33%) and high-resolution freshwater fish data (98.34%), and that it outperforms traditional convolutional neural networks.
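To make the label smoothing idea mentioned in the abstract concrete, the following is a minimal sketch of a label-smoothed cross-entropy loss in plain Python. It assumes the common uniform formulation, in which a fraction epsilon of the target probability is spread evenly over all K classes. The abstract does not state which variant or smoothing factor Fish-TViT uses, so the function names and the value epsilon = 0.1 are illustrative assumptions rather than the authors' settings.

```python
import math


def smooth_labels(target, num_classes, epsilon=0.1):
    """Build a smoothed target distribution (uniform smoothing, assumed variant).

    Every class receives epsilon / K; the true class additionally receives
    1 - epsilon, so the distribution still sums to 1.
    """
    off_value = epsilon / num_classes
    dist = [off_value] * num_classes
    dist[target] += 1.0 - epsilon
    return dist


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def label_smoothing_loss(logits, target, epsilon=0.1):
    """Cross-entropy between the smoothed target and the predicted distribution.

    epsilon = 0.1 is a hypothetical default, not a value taken from the paper.
    With epsilon = 0 this reduces to ordinary cross-entropy.
    """
    log_probs = log_softmax(logits)
    q = smooth_labels(target, len(logits), epsilon)
    return -sum(qi * lp for qi, lp in zip(q, log_probs))


if __name__ == "__main__":
    confident_logits = [10.0, 0.0, 0.0]  # very confident prediction for class 0
    print(label_smoothing_loss(confident_logits, target=0, epsilon=0.0))
    print(label_smoothing_loss(confident_logits, target=0, epsilon=0.1))
```

With smoothing enabled, a very confident correct prediction still incurs a noticeable loss, because the smoothed target assigns some probability to the other classes. This is the mechanism by which label smoothing discourages overconfident outputs.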