Ground-Based Cloud Image Segmentation Method Based on Improved U-Net

Deyang Yin; Jinxin Wang; Kai Zhai; Jianfeng Zheng; Hao Qiang

doi:10.3390/app142311280

Applied Sciences (Dec 2024)

Affiliations

Deyang Yin: School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
Jinxin Wang: School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
Kai Zhai: School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
Jianfeng Zheng: School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
Hao Qiang: School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China

DOI: https://doi.org/10.3390/app142311280
Journal volume & issue: Vol. 14, no. 23
p. 11280

Abstract

Cloud image segmentation is a technique that divides images captured by meteorological satellites or ground-based observations into different regions or categories. By extracting the distribution, shape, and dynamic features of clouds, it provides precise data support for the meteorological and environmental fields, significantly influencing photovoltaic (PV) power generation forecasting, astronomical telescope observatory site selection, and weather forecasting. A ground-based cloud image segmentation model based on an improved U-Net is proposed, which adopts an overall encoder–decoder structure. In the encoder stage, this paper constructs a dilated convolution–atrous spatial pyramid pooling (ASPP)–dilated convolution structure to enhance early cloud feature extraction. Dilated convolution expands the receptive field by inserting holes between the kernel elements of a standard convolution, thereby capturing a larger range of contextual information. ASPP maintains high resolution while attending to both the local details and the global structure of the image. In the decoder stage, bicubic interpolation is used for up-sampling to restore the feature map resolution and improve the clarity of the segmented image. Bicubic interpolation estimates each output pixel with cubic polynomial functions of the neighboring input pixel values. In addition, this paper designs a novel skip connection structure between the encoder and decoder, composed of a depthwise separable path (DS path) and an improved channel spatial attention module (Im-CSAM) connected in sequence. The DS path combines depthwise separable convolutions and residual structures to facilitate information exchange between high-level and low-level features. The Im-CSAM is a modular attention mechanism that focuses on important cloud features in the spatial and channel dimensions to enhance segmentation accuracy.
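The receptive-field effect of dilated convolution described above can be illustrated with a minimal one-dimensional sketch in plain Python. This is not the authors' implementation: the function names `dilated_conv1d` and `receptive_field`, the 3-tap kernel, and the dilation rates are all assumptions chosen for illustration, and the paper's network operates on 2D feature maps rather than 1D lists.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation - 1` holes between taps.

    Hypothetical helper for illustration only; no padding or stride.
    """
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Each tap reads the input `dilation` samples after the previous tap.
            acc += weight * signal[start + j * dilation]
        out.append(acc)
    return out


signal = [float(i) for i in range(10)]
kernel = [1.0, 1.0, 1.0]  # same three weights in both cases

dense = dilated_conv1d(signal, kernel, dilation=1)  # each output spans 3 inputs
holes = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 inputs

print(receptive_field(3, 1), receptive_field(3, 2))
print(dense)
print(holes)
```

With the same three weights, raising the dilation rate from 1 to 2 widens the span of each output from 3 to 5 input samples without adding parameters. An ASPP-style block would, under the same assumptions, run several such convolutions with different dilation rates in parallel and combine their outputs.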
Experiments show that, compared to the traditional U-Net, the accuracy, precision, and MIoU of this model improved by 2.2%, 4.1%, and 5.0%, respectively, on the SWINySEG dataset, and by 3.2%, 3.6%, and 5.8%, respectively, on the TCDD dataset, indicating that the improved method has better generalization ability and segmentation performance.
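For readers unfamiliar with the three reported metrics, the sketch below computes accuracy, precision, and MIoU for a binary cloud/sky mask in plain Python. It assumes the standard confusion-matrix definitions, with MIoU taken as the mean of the cloud-class and sky-class IoU; the abstract does not state the exact formulas the authors used, and the function name `segmentation_metrics` and the toy masks are hypothetical.

```python
def segmentation_metrics(pred, truth):
    """Accuracy, precision, and mean IoU for binary masks (1 = cloud, 0 = sky).

    Assumes standard definitions; the paper's exact formulas are not given
    in the abstract.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Per-class IoU: intersection over union of predicted and true regions.
    iou_cloud = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    iou_sky = tn / (tn + fp + fn) if (tn + fp + fn) else 0.0
    miou = (iou_cloud + iou_sky) / 2
    return {"accuracy": accuracy, "precision": precision, "miou": miou}


# Toy 2x4 masks, invented purely to exercise the function.
truth = [[1, 1, 0, 0],
         [1, 0, 0, 0]]
pred = [[1, 0, 0, 0],
        [1, 1, 0, 0]]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

On these toy masks there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, so accuracy is 6/8, precision is 2/3, and MIoU is the mean of 2/4 and 4/6.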

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
