IEEE Access (Jan 2023)

Learning and Adaptation From Minimum Samples With Heterogeneous Quality: An Investigation of Image Segmentation Networks on Natural Dataset

  • V. V. Sajith Variyar,
  • V. Sowmya,
  • Ramesh Sivanpillai,
  • Gregory K. Brown

DOI
https://doi.org/10.1109/ACCESS.2023.3275748
Journal volume & issue
Vol. 11
pp. 47040 – 47052

Abstract

Training deep learning-based image segmentation networks requires a large number of samples of adequate quality. However, obtaining a large number of samples is not possible in certain domains. Recent approaches use augmentation and transfer learning techniques to overcome small sample sizes. Augmentation techniques are known to introduce noise into the dataset, while transfer learning approaches may fail if the dataset is novel to deep learning algorithms. This study investigated how four deep learning-based image segmentation networks learned and adapted to identify epiphytes when trained with few image samples (n = 132) of heterogeneous quality, without transfer learning or data augmentation. The four networks, an encoder-decoder with skip connections (Unet), a deep residual network (DRUnet), a vision transformer (TransUnet), and a conditional generative network (Pix2Pix), represent different generations of deep learning networks. The segmentation performance of the trained models was evaluated by computing the Jaccard score (IoU) of the predicted labels for the test images. Test images (n = 20) of heterogeneous quality were grouped into six categories based on target occupancy and lighting conditions. Results showed that, among the four networks, the TransUnet model achieved a high average Jaccard score of 0.78. The layers added on top of the basic Unet design were important for accurate localization and context understanding of the target plants. However, these networks misclassified visually similar plants as the target plant. The transformer and attention layers in TransUnet contributed significantly to localizing the target and understanding context in images of varying quality. TransUnet can be used for segmenting target plants when few training samples are available. The Unet-based encoder-decoder in TransUnet contributes to deriving good features from minimum samples.
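For readers unfamiliar with the evaluation metric, the Jaccard score (IoU) named in the abstract is the ratio of the intersection to the union of the predicted and reference masks. The following is a minimal illustrative sketch, not the authors' code: the function name `jaccard_score`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def jaccard_score(pred, target):
    """Jaccard score (IoU) of two binary masks given as nested rows of 0/1.

    Hypothetical helper for illustration; the masks must have the same shape.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel labeled target in both masks
            if p or t:
                union += 1  # pixel labeled target in at least one mask

    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return intersection / union


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard_score(predicted, reference)
print(score)
```

An average score such as the one reported for a model would then be the mean of this per-image value over the test images, under the assumption that the scores are averaged per image.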

Keywords