This paper presents a novel YOLOv5-based network, CBAM-YOLOv5, for quantifying the degree of defects in continuously cast billets. The network addresses two challenges in billet images: noise and dirty spots, and defects of widely varying size. CBAM-YOLOv5 integrates the channel and spatial attention of the Convolutional Block Attention Module (CBAM) into the C3 layer of the YOLOv5 structure, so that channel and spatial information are fused more effectively, attention is focused on the defect target, and detection capability improves, particularly for different levels of segregation. The feature pyramid is also improved: the feature map obtained after the fourth down-sampling of the backbone network is fed into the feature pyramid through CBAM, which enlarges the receptive field for the target and reduces information loss during fusion. Finally, several experiments are conducted on a self-built dataset of continuously cast billets collected from different sources. The results show that the network reaches a mean average precision (mAP) of 93.7%, indicating that it can support intelligent defect rating.
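To make the attention mechanism described above concrete, the following is a minimal, framework-free sketch of the general CBAM formulation: channel attention (a shared two-layer MLP applied to average- and max-pooled channel descriptors) followed by spatial attention (a convolution over the channel-wise average and max maps). It is not the configuration used in this work. The function names, the tiny tensor sizes, the reduction to two hidden units, the 3x3 kernel, and the random weights are all assumptions chosen for illustration, and the placement inside the C3 layer is not modelled.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """fmap: [C][H][W]; w1: [hidden][C]; w2: [C][hidden]. Returns C weights in (0, 1)."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]

    def mlp(v):
        # Shared MLP: linear -> ReLU -> linear (no bias terms in this sketch).
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(fmap, kernel):
    """kernel: [2][k][k] weights applied to the (avg, max) channel-pooled maps."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    k = len(kernel[0])
    pad = k // 2
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    pooled = [avg, mx]
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding at the borders
                            s += kernel[p][di][dj] * pooled[p][y][x]
            out[i][j] = sigmoid(s)
    return out


def cbam(fmap, w1, w2, kernel):
    """Apply channel attention, then spatial attention; output has the input's shape."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    ca = channel_attention(fmap, w1, w2)
    refined = [[[ca[c] * v for v in row] for row in fmap[c]] for c in range(C)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * refined[c][i][j] for j in range(W)] for i in range(H)]
            for c in range(C)]


# Hypothetical toy input: 4 channels, 5x6 feature map, random (untrained) weights.
random.seed(0)
C, H, W, HIDDEN, K = 4, 5, 6, 2, 3
feature_map = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-0.5, 0.5) for _ in range(C)] for _ in range(HIDDEN)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(C)]
kernel = [[[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(K)] for _ in range(2)]

refined_map = cbam(feature_map, w1, w2, kernel)
print(len(refined_map), len(refined_map[0]), len(refined_map[0][0]))
```

Because both attention maps lie in (0, 1), the module only rescales features and leaves the tensor shape unchanged, which is what allows such a block to be placed inside an existing layer or on the path into a feature pyramid without altering the surrounding dimensions.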