PLoS ONE (Jan 2024)
Advancements in urban scene segmentation using deep learning and generative adversarial networks for accurate satellite image analysis.
Abstract
In urban scene segmentation, the "image-to-image translation issue" refers to the fundamental task of transforming input images into meaningful segmentation maps, that is, translating the visual information in the input image into semantic labels for different classes. When this translation is inaccurate or incomplete, segmentation fails because the model struggles to assign pixels to the correct semantic categories. This study proposes a conditional Generative Adversarial Network (cGAN) for creating high-resolution urban maps from satellite images. The method combines semantic and spatial data within the cGAN framework to produce realistic urban scenes while preserving crucial details. To assess its performance, extensive experiments are performed on two benchmark datasets, ISPRS Potsdam and ISPRS Vaihingen. Intersection over Union (IoU) and Pixel Accuracy are the two quantitative metrics used to evaluate the segmentation accuracy of the produced maps. The proposed method attains an IoU of 87% and a Pixel Accuracy of 93%, outperforming traditional techniques and generating urban maps with finely detailed information. The approach provides a framework for addressing image-to-image translation difficulties in urban scene segmentation and demonstrates the potential of cGANs for producing high-quality urban maps from satellite data.
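The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names (`pixel_accuracy`, `per_class_iou`, `mean_iou`), the toy label maps, and the choice to average IoU over classes are all assumptions made for illustration, since the abstract does not state how the reported IoU was aggregated.

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the reference label."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    return float((pred == target).mean())


def per_class_iou(pred, target, num_classes):
    """IoU for each class: |pred AND target| / |pred OR target|.

    Classes absent from both maps get NaN so they can be skipped
    when averaging (an assumed convention, not taken from the paper).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union > 0:
            ious[c] = np.logical_and(p, t).sum() / union
    return ious


def mean_iou(pred, target, num_classes):
    """Average of the per-class IoU values, ignoring absent classes."""
    return float(np.nanmean(per_class_iou(pred, target, num_classes)))


# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
target = np.array([[0, 0, 1],
                   [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])

print(pixel_accuracy(pred, target))    # 4 of 6 pixels match
print(per_class_iou(pred, target, 3))  # IoU of 1/3, 2/3 and 1/2 per class
print(mean_iou(pred, target, 3))       # average of the three values
```

In this toy case four of six pixels are labelled correctly, while the class-averaged IoU is lower (0.5) because IoU penalises both missed and spurious pixels for every class. This is why the two metrics are usually reported together.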