Datasets for training and validating a deep learning-based system to detect microfossil fish teeth from slide images
Kazuhide Mimura, Kentaro Nakamura
Affiliations
Kazuhide Mimura
Ocean Resources Research Center for Next Generation, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba 275-0016, Japan; Frontier Research Center for Energy and Resources, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
Kentaro Nakamura
Frontier Research Center for Energy and Resources, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan; Department of Systems Innovation, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan; Ocean Resources Research Center for Next Generation, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba 275-0016, Japan; Corresponding author at: Frontier Research Center for Energy and Resources, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan.
In this paper, we describe three datasets that were used to train, validate, and test deep learning models for detecting microfossil fish teeth. The first dataset was created for training and validating a Mask R-CNN model that detects fish teeth in images taken with a microscope. Its training set contains 866 images and one annotation file; its validation set contains 92 images and one annotation file. The second dataset was created for training and validating EfficientNet-V2 models; it includes 17,400 images of teeth and 15,036 images that contain only noise (particles other than teeth). The third dataset was created to evaluate the performance of a system that combines a Mask R-CNN model and an EfficientNet-V2 model; it contains 5,177 images with annotation files that give the locations of 431 teeth within the images.
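The counts reported above imply a few simple ratios that are useful when planning training or interpreting metrics: the validation share of the Mask R-CNN dataset, the class balance of the classification dataset, and the average number of annotated teeth per image in the test dataset. The following is a minimal sketch of that arithmetic only; the constant and function names are hypothetical and are not part of the published datasets, and the per-image figure is a plain average that says nothing about how teeth are distributed across images.

```python
# Counts copied from the dataset description; all names here are illustrative.
MASK_RCNN_TRAIN_IMAGES = 866
MASK_RCNN_VAL_IMAGES = 92
CLASSIFIER_TOOTH_IMAGES = 17_400
CLASSIFIER_NOISE_IMAGES = 15_036
TEST_IMAGES = 5_177
TEST_ANNOTATED_TEETH = 431


def fraction(part: int, whole: int) -> float:
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


def summarize() -> dict:
    """Derive simple ratios from the reported image and annotation counts."""
    mask_rcnn_total = MASK_RCNN_TRAIN_IMAGES + MASK_RCNN_VAL_IMAGES
    classifier_total = CLASSIFIER_TOOTH_IMAGES + CLASSIFIER_NOISE_IMAGES
    return {
        "mask_rcnn_total_images": mask_rcnn_total,
        "mask_rcnn_val_fraction": fraction(MASK_RCNN_VAL_IMAGES, mask_rcnn_total),
        "classifier_total_images": classifier_total,
        "classifier_tooth_fraction": fraction(CLASSIFIER_TOOTH_IMAGES, classifier_total),
        # Average only: the source does not state how teeth are spread over images.
        "test_teeth_per_image": fraction(TEST_ANNOTATED_TEETH, TEST_IMAGES),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        if isinstance(value, float):
            print(f"{name}: {value:.4f}")
        else:
            print(f"{name}: {value}")
```

Under these counts, the validation set is roughly 9.6% of the Mask R-CNN images, teeth make up about 53.6% of the classification images, and the test dataset averages about 0.083 annotated teeth per image, so teeth are sparse relative to the number of test images.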