Applied Sciences (Oct 2024)
Enhancing a You Only Look Once-Plated Detector via Auxiliary Textual Coding for Multi-Scale Rotating Remote Sensing Objects in Transportation Monitoring Applications
Abstract
With the rapid development of intelligent information technologies, remote sensing object detection has come to play an important role in many application fields. In recent years it has attracted particular attention for assisting with food safety supervision, yet it still faces a hard-to-resolve trade-off between oversized parameter counts and low detection performance. Hence, this article proposes EYMR-Net, a novel remote sensing detection framework for multi-scale objects that are rotated and mutually occluded. The approach builds on the YOLO-v7 (You Only Look Once) architecture with a Swin Transformer backbone, which offers multi-scale receptive fields for mining rich features. An enhanced attention module is then added to exploit the spatial and dimensional interrelationships among different local characteristics. Subsequently, an effective rotated-box regression mechanism based on circular smooth labels is introduced into EYMR-Net, addressing the problem that horizontal YOLO boxes ignore changes in object orientation. Extensive experiments on the DOTA datasets show that EYMR-Net achieves an mAP0.5 of up to 74.3%. Further ablation experiments verify that the proposed approach balances performance and efficiency, which is beneficial for practical remote sensing applications in transportation monitoring and supply chain management.
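The circular-label idea mentioned in the abstract can be illustrated with a minimal sketch. It treats the box angle as a classification target over discrete bins and softens the one-hot label with a window that wraps around the angular boundary, so that 179 degrees and 0 degrees count as neighbours. The function name, the 180 one-degree bins, the Gaussian window, and the value of sigma are all assumptions made for illustration; the abstract does not state the settings used in EYMR-Net.

```python
import numpy as np


def circular_smooth_label(angle_deg, num_bins=180, sigma=6.0):
    """Encode an angle as a soft, periodic classification target.

    Hypothetical helper: the bin count, Gaussian window, and sigma are
    illustrative assumptions, not values taken from the paper.
    """
    bins = np.arange(num_bins)
    target = int(round(angle_deg)) % num_bins
    # Circular distance: bins near the wrap-around point count as close.
    diff = np.abs(bins - target)
    dist = np.minimum(diff, num_bins - diff)
    # Gaussian window centred on the target bin; the peak value is 1.
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))


label = circular_smooth_label(179)
```

With this encoding, bin 0 receives the same weight as bin 178 for a target of 179 degrees, which is the periodicity that a plain one-hot or regression target would not capture.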
Keywords