An Improved Small Target Segmentation Model Based on Mask Dino

Jun Yang; Xu Chen; Yun Guan; Yixuan Hu; Gang Ge

doi:10.3390/app15041832

Applied Sciences (Feb 2025)

Affiliations

Jun Yang: School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China
Xu Chen: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Yun Guan: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Yixuan Hu: School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Gang Ge: School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China

DOI: https://doi.org/10.3390/app15041832
Journal volume & issue: Vol. 15, no. 4, p. 1832

Abstract

To address the low segmentation accuracy of the Mask Dino segmentation method on small objects, we propose an improved small object segmentation model called FFMask Dino. First, we introduce scaled cosine attention and the log-spaced continuous position bias (log-CPB) method into the Swin Transformer backbone network. Second, by adjusting the network structure, we enhance the feature extraction process, which helps the model maintain generalization across different datasets and reduces the risk of overfitting. Finally, we propose the FFPN module to optimize the pathways for feature fusion and transmission. The improved FPN reduces unnecessary computation, speeds up inference, and integrates multi-scale feature details with high-level semantic information to complement object features, thereby improving segmentation accuracy. Experimental results show that the improved model achieves a mean Intersection over Union (mIoU) of 42.15% on the ADE20K dataset for semantic segmentation, a 0.96% increase over the Mask Dino method. On the COCO dataset for instance segmentation, with the Swin Transformer backbone, the Mask AP and Box AP are 47.10 and 52.60, respectively, improvements of 1% and 1.3% over the Mask Dino method. With the ResNet-50 backbone, the Mask AP and Box AP are 40.00 and 44.10, respectively, improvements of 0.5% and 0.9% over the Mask Dino method. For panoptic segmentation on the COCO dataset, the PQ is 54.95 with the Swin Transformer backbone, a 0.4% increase over the Mask Dino method, and 46.93 with the ResNet-50 backbone, a 0.9% increase over the Mask Dino method. These results demonstrate that the proposed model improves on the accuracy of Mask Dino in segmenting small objects across various segmentation tasks.
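The abstract does not describe how scaled cosine attention and log-CPB are implemented in FFMask Dino. As a rough illustration only, the sketch below follows the general formulation published for Swin Transformer V2: attention logits are the cosine similarity of query and key divided by a temperature `tau` (clamped at 0.01) plus a position bias, and relative coordinates are mapped to a log scale before being fed to the bias network. The function names, the fixed `tau`, and the plain-list data layout are assumptions made for this example, not the authors' code.

```python
import math

def cosine_similarity(q, k, eps=1e-12):
    """Cosine of the angle between two vectors (independent of their norms)."""
    dot = sum(a * b for a, b in zip(q, k))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_k = math.sqrt(sum(b * b for b in k))
    return dot / max(norm_q * norm_k, eps)

def scaled_cosine_logits(queries, keys, tau=0.1, bias=None):
    """Attention logits: cos(q_i, k_j) / tau + bias[i][j].

    tau is a learnable scalar in a real model; here it is a fixed
    argument (assumption) and is clamped so it never drops below 0.01.
    """
    tau = max(tau, 0.01)
    logits = []
    for i, q in enumerate(queries):
        row = []
        for j, k in enumerate(keys):
            b = bias[i][j] if bias is not None else 0.0
            row.append(cosine_similarity(q, k) / tau + b)
        logits.append(row)
    return logits

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def log_spaced(delta):
    """Map a relative coordinate to log space: sign(d) * log(1 + |d|)."""
    return math.copysign(math.log1p(abs(delta)), delta)

# Two queries pointing the same way but with very different norms
# produce identical logits, which is the point of cosine attention.
queries = [[1.0, 0.0], [10.0, 0.0]]
keys = [[2.0, 0.0], [0.0, 3.0]]
logits = scaled_cosine_logits(queries, keys, tau=0.1)
weights = [softmax(row) for row in logits]
```

Because the logits depend only on the angle between query and key, large activation magnitudes cannot dominate the softmax, and the log-spaced coordinates compress large offsets so that a bias network trained on one window size extrapolates more gently to larger ones.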

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
