IEEE Access (Jan 2024)

Land Cover Classification From RGB and NIR Satellite Images Using Modified U-Net Model

  • Won-Kyung Baek,
  • Moung-Jin Lee,
  • Hyung-Sup Jung

DOI
https://doi.org/10.1109/ACCESS.2024.3401416
Journal volume & issue
Vol. 12
pp. 69445 – 69455

Abstract

Multi-spectral satellite imagery has been widely used for land cover classification because it provides meaningful spectral information about Earth’s objects that are difficult to describe using visible-band images alone. Near-infrared imagery supports classification in the fields of agriculture, forestry, and geology/natural resources. However, the classification performance of deep learning approaches using both red-green-blue (RGB) and near-infrared (NIR) images has not been significantly superior to that obtained using RGB images alone, because the spectral information may not be appropriately exploited by the deep learning methods. In most deep learning approaches, the convolution operation does not separate the pixel values along the band direction but instead mixes the values of all bands. This mixing can lead to a loss of information, particularly with multi-band images such as satellite imagery, because important spectral information may be obscured, affecting the model’s accuracy and generalization capability. To overcome this drawback, this study presents an efficient model, the separated-input-based U-Net (SiU-Net), which modifies the U-Net model by separating the RGB and NIR images. To demonstrate the improvement in land cover classification, the performance of SiU-Net was compared with that of the DeepLabV3+ and U-Net models. We used a 2020 satellite-derived land cover dataset consisting of 300 patches in total. These patches were extracted from Sentinel-2 images, including both RGB and NIR bands, with a resolution of 10 meters, and each patch was sliced into $512\times 512$ pixel segments. The 300 patches were selected without overlap and divided into approximately 64% (192 patches) for training, 16% (48 patches) for validation, and 20% (60 patches) for testing. The final performance evaluations were conducted on the test data.
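The split described above can be sketched in a few lines of Python. This is only an illustration of the stated counts (192, 48, and 60 of 300 patches). The abstract does not say how patches were assigned to each subset, so the random shuffle, the seed, and the function name `split_patches` are assumptions.

```python
import random

def split_patches(n_patches=300, n_train=192, n_val=48, seed=0):
    """Return three disjoint lists of patch indices (train, val, test).

    Assumption: patches are assigned by a seeded random shuffle; the
    abstract only gives the subset sizes, not the assignment rule.
    """
    indices = list(range(n_patches))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remaining 60 patches
    return train, val, test

train, val, test = split_patches()
for name, part in (("train", train), ("val", val), ("test", test)):
    print(f"{name}: {len(part)} patches ({len(part) / 300:.0%})")
```

Because every index appears in exactly one list, no patch is shared between training, validation, and testing.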
The F1 score obtained with SiU-Net was about 0.797, higher than the scores of about 0.541 for DeepLabV3+ and 0.720 for U-Net. Moreover, with a small training dataset, the F1 score of SiU-Net (0.589) was higher than those of DeepLabV3+ (0.051) and U-Net (0.455), and the performance degradation due to data imbalance was reduced in the SiU-Net model. This indicates that the SiU-Net model may be the most suitable of the three when the training data are small and imbalanced.
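For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed from per-pixel class labels. The abstract does not state how the reported scores were averaged across classes, so the macro average used here is an assumption, and the function names and toy labels are hypothetical.

```python
def f1_per_class(y_true, y_pred, n_classes):
    """Per-class F1 = 2*TP / (2*TP + FP + FN) over flat label sequences."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that is absent from both labels and predictions scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred, n_classes)
    return sum(scores) / n_classes

# Toy labels for eight pixels and three land cover classes (made up).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
per_class = f1_per_class(y_true, y_pred, 3)
overall = macro_f1(y_true, y_pred, 3)
print(per_class, overall)
```

A macro average gives each class equal weight regardless of how many pixels it covers, so poor performance on rare classes lowers the score noticeably. This is one reason such a metric is sensitive to class imbalance.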

Keywords