An Improved YOLOv8-Based Lightweight Attention Mechanism for Cross-Scale Feature Fusion

Shaodong Liu; Faming Shao; Weijun Chu; Juying Dai; Heng Zhang

doi:10.3390/rs17061044

Remote Sensing (Mar 2025)

Affiliations

Shaodong Liu: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
Faming Shao: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
Weijun Chu: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
Juying Dai: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
Heng Zhang: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China

DOI: https://doi.org/10.3390/rs17061044
Journal volume & issue: Vol. 17, no. 6, p. 1044

Abstract

This paper addresses small object detection in remote sensing imagery by proposing LACF-YOLO, an improved YOLOv8-based lightweight attention cross-scale feature fusion model. Before the backbone network outputs its feature maps, the model applies a lightweight attention module, Triplet Attention, and replaces the Concatenation with Fusion (C2f) module with a more convenient and higher-performing dilated inverted convolution layer, so that richer contextual information is captured during feature extraction. The main body of the cross-scale feature fusion network consists of convolutional blocks that combine partial convolution with pointwise convolution to integrate feature information from different levels. The model also uses the faster-converging Focal EIOU loss function to improve accuracy and efficiency. Experimental results on the DOTA and VisDrone2019 datasets demonstrate the effectiveness of the improved model. Compared with the original YOLOv8 model, LACF-YOLO achieves a 2.9% increase in mAP and a 4.6% increase in mAP_S on the DOTA dataset, and a 3.5% increase in mAP and a 3.8% increase in mAP_S on the VisDrone2019 dataset, with a 34.9% reduction in the number of parameters and a 26.2% decrease in floating-point operations. The model exhibits superior performance in aerial object detection.
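
The abstract names the Focal EIOU loss but does not spell it out. As a rough illustration only, the sketch below follows the commonly cited formulation of that loss (an IoU term plus penalties on centre distance, width difference, and height difference, all weighted by IoU raised to a power gamma). The function name, the (x1, y1, x2, y2) box layout, and the default gamma = 0.5 are assumptions made for this example and are not taken from the paper.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for one pair of axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    gamma=0.5 is an assumed default, not a value taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared centre distance, normalised by the enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: pairs with low IoU contribute less
    # (and exactly zero when the boxes do not overlap at all).
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

In a training pipeline the same arithmetic would normally be written with batched tensor operations; the scalar version here is only meant to make the individual terms visible.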

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
