Frontiers in Marine Science (Aug 2022)
Accelerating Species Recognition and Labelling of Fish From Underwater Video With Machine-Assisted Deep Learning
Abstract
Machine-assisted detection and classification of fish species from Baited Remote Underwater Video Station (BRUVS) surveys using deep learning offers an opportunity to reduce analysis time and to report rapidly on the status of marine ecosystems. Training object detection algorithms for BRUVS analysis presents two significant challenges: the model requires training data in which bounding boxes already mark the location of every individual fish in a scene, and it requires training data in which those fish are labelled by species. In both cases substantial volumes of data are required, and producing them is currently a manual, labour-intensive process, which has led to a paucity of the labelled data needed to train object detection models for species detection. Here, we present a “machine-assisted” approach comprising i) a generalised model that automates the application of bounding boxes to fish in any underwater environment and ii) fish detection and classification to species level for up to 12 target species. A catch-all “fish” class is applied to individuals that remain unidentified because of a lack of available training and validation data. Machine-assisted bounding box annotation detected and labelled fish in out-of-sample datasets with a recall between 0.70 and 0.89, and automated labelling of the 12 target species achieved an F1 score of 0.79. On average, 12% of fish were given a bounding box with a species label, and 88% were located, given the “fish” label and flagged for manual labelling. This combined, machine-assisted approach is a significant advance towards the applied use of deep learning for fish species detection in fish analysis workflows, and it has potential for uptake by fish ecologists if integrated into video analysis software. Manual labelling and classification effort is still required, and a community effort to address the severe paucity of training data would improve automation accuracy and encourage wider uptake.
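For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how recall and F1 score are conventionally computed from detection counts. It uses the standard definitions and is not the authors' evaluation code; the function names and the example counts are hypothetical and chosen only for illustration.

```python
# Standard detection metrics from true positives (tp), false positives (fp)
# and false negatives (fn). Illustrative only: not the paper's evaluation code.

def precision(tp: int, fp: int) -> float:
    """Fraction of predicted fish boxes that match a real fish."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of real fish that received a matching predicted box."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    tp, fp, fn = 80, 10, 20
    print(f"precision = {precision(tp, fp):.3f}")
    print(f"recall    = {recall(tp, fn):.3f}")
    print(f"F1        = {f1_score(tp, fp, fn):.3f}")
```

In this reading, a recall of 0.70 would mean that 70% of the fish present in the annotated frames were located by the model. How a predicted box is matched to a ground-truth box (for example, by an overlap threshold) is not stated in the abstract.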
Keywords