Integrate MSRCR and Mask R-CNN to Recognize Underwater Creatures on Small Sample Datasets

Shaojian Song; Jingxu Zhu; Xiuhua Li; Qingbao Huang

doi:10.1109/ACCESS.2020.3025617

IEEE Access (Jan 2020)

Affiliations

Shaojian Song: ORCiD; School of Electrical Engineering, Guangxi University, Nanning, China
Jingxu Zhu: School of Electrical Engineering, Guangxi University, Nanning, China
Xiuhua Li: School of Electrical Engineering, Guangxi University, Nanning, China
Qingbao Huang: School of Electrical Engineering, Guangxi University, Nanning, China

DOI: https://doi.org/10.1109/ACCESS.2020.3025617
Journal volume & issue: Vol. 8
pp. 172848 – 172858

Abstract

The poor quality of optical imaging caused by the complex and varying underwater environment is a significant challenge for underwater target recognition. Moreover, the scarcity of relevant datasets may lead to overfitting in target recognition models based on deep learning. Taking the instance segmentation of three underwater creatures (echinus, holothurian, and starfish) as an example, we propose a new method for recognizing underwater creatures. It combines the MSRCR (multi-scale Retinex with color restoration) image enhancement algorithm with the Mask R-CNN (region-based convolutional neural network) framework and achieves a mAP (mean average precision) higher than 90% on a small sample dataset. The method consists of three major steps. First, the dataset of 84 images is augmented to 430 images by flipping, adding noise, and generating samples with a GAN (generative adversarial network), and all images are enhanced with MSRCR to improve their quality. Second, the model is pre-trained on the COCO (Microsoft Common Objects in Context) dataset to shorten the training time and mitigate overfitting. Finally, the pre-trained model is transferred to the underwater dataset, and the whole training process is completed. We achieve 97.46% precision and 94.52% recall, and the mAP at an intersection over union (IoU) threshold of 50% is 94.84%. The effectiveness of the proposed method is verified by comparing it with several popular target recognition models, including SSD (Single Shot Detector), YOLOv3 (You Only Look Once), the original Mask R-CNN, and a SIFT-based (scale-invariant feature transform) model.
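The precision and recall figures above depend on matching predicted masks to ground-truth masks at an IoU threshold. The following is a minimal sketch of how such a matching could be computed. It is not the authors' evaluation code: the function names, the greedy matching rule, and the toy rectangular masks are all assumptions made for illustration, and it reports precision and recall only, not mAP.

```python
def mask_iou(a, b):
    """IoU of two binary masks given as sets of (row, col) pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted masks to ground-truth masks.

    predictions: list of (confidence, mask) pairs (hypothetical format).
    ground_truths: list of masks.
    """
    matched = set()
    true_positives = 0
    # Visit predictions from most to least confident.
    for _, pred in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth mask may be matched only once
            iou = mask_iou(pred, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall


def rect(r0, c0, r1, c1):
    """Toy mask: all pixels in rows [r0, r1) and columns [c0, c1)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy data (made up): two ground-truth objects, three predictions.
ground_truths = [rect(0, 0, 4, 4), rect(10, 10, 14, 14)]
predictions = [
    (0.9, rect(0, 0, 4, 3)),      # IoU 0.75 with the first object
    (0.8, rect(10, 11, 14, 15)),  # IoU 0.60 with the second object
    (0.3, rect(20, 20, 22, 22)),  # overlaps nothing: a false positive
]
precision, recall = precision_recall(predictions, ground_truths)
```

On this toy data, two of the three predictions clear the 0.5 threshold, so precision is 2/3 and recall is 1.0. Raising `iou_threshold` to 0.7 would reject the second prediction and lower both values.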

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
