IEEE Access (Jan 2024)
Application of Mask R-CNN and YOLOv8 Algorithms for Concrete Crack Detection
Abstract
The efficient and accurate detection of cracks in concrete structures is critical for maintaining structural integrity and safety. This study compares two state-of-the-art convolutional neural network (CNN) models, Mask R-CNN and YOLOv8, for automated concrete crack detection. The two models represent the two mainstream approaches to object detection and instance segmentation: two-stage and single-stage, respectively. We evaluate both models on 1,203 concrete images, split 7:2:1 into training, testing, and validation sets, and assess their accuracy and processing speed. Mask R-CNN achieves a mean Intersection over Union (IoU) of 96.5%, with a minimum IoU of 77% and higher consistency, whereas YOLOv8 achieves a mean IoU of 90.6% and often fails completely, with an IoU of 0%. In terms of computation speed, YOLOv8 averages 0.3225 s of processing time per image, slightly faster than Mask R-CNN’s 0.4867 s. Despite YOLOv8’s faster processing, accuracy should be prioritized over speed in concrete crack detection, so Mask R-CNN appears to be the more suitable model for reliable crack detection. We also show that the accuracy of Mask R-CNN on crack detection tasks can be further improved by employing a ResNeXt backbone.
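For readers unfamiliar with the accuracy metric, the following is a minimal sketch of how a per-image IoU and its mean and minimum over a test set could be computed from binary crack masks. It is not the evaluation code used in this study: the function names `mask_iou` and `summarize_iou`, the representation of masks as 2D lists of 0/1, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from statistics import mean


def mask_iou(pred, truth):
    """IoU between two binary masks given as equally sized 2D lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0  # pixel marked as crack in both masks
            union += 1 if (p or t) else 0   # pixel marked as crack in either mask
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def summarize_iou(pairs):
    """Mean and minimum IoU over (prediction, ground truth) mask pairs."""
    scores = [mask_iou(pred, truth) for pred, truth in pairs]
    return {"mean": mean(scores), "min": min(scores)}
```

Under this definition, a prediction that shares no crack pixels with the ground truth scores an IoU of 0, which corresponds to the complete failures noted above, and a single such image lowers the minimum IoU to 0 regardless of the mean.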
Keywords