IEEE Access (Jan 2020)

Spatially Variant Convolutional Autoencoder Based on Patch Division for Pill Defect Detection

  • Sora Kim,
  • Youngjae Jo,
  • Jungchan Cho,
  • Jiwoo Song,
  • Younyoung Lee,
  • Minsik Lee

DOI
https://doi.org/10.1109/ACCESS.2020.3041790
Journal volume & issue
Vol. 8
pp. 216781 – 216792

Abstract

Detecting pill defects remains challenging, despite extensive recent study, because defective samples are scarce. In this paper, we propose a pipeline composed of a pill detection module and an autoencoder-based defect detection module to detect defective pills in pill packages. We also created a new dataset to test our model. The pill detection module segments the pills in an aluminum-plastic package into individual pills: a shallow segmentation network first segments the pill regions, and the watershed algorithm then divides the result into individual pills. The defect detection module identifies defects in individual pills. It is trained only on normal data, so it is expected to be unable to reconstruct defective data correctly. In practice, however, a conventional autoencoder reconstructs defective data better than expected, even when it is trained only on normal data. We therefore introduce a patch division method to mitigate this problem. Patch division splits the output of the convolutional encoder network into patch-wise features and then applies a patch-wise encoder layer, in which each latent patch has its own independent weights and bias. This can be interpreted as reconstructing the input image with multiple local autoencoders. Patch division makes the network concentrate only on reconstructing local regions, thereby reducing its overall capacity, which prevents the proposed network from reconstructing unseen data well. Experiments show that the proposed patch division technique improves defect detection performance and outperforms existing deep-learning-based anomaly detection methods. The ablation study shows the efficacy of patch division and of the compression that follows the concatenation of patch-wise features.
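
The patch division idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature-map size, grid size, latent dimension, and the helper names `split_into_patches` and `patchwise_encode` are all assumptions chosen for illustration, and a plain linear map stands in for whatever layer the paper actually uses. The sketch only shows the property the abstract states, namely that each patch is encoded with its own weights and bias, so one patch's latent code does not depend on any other patch.

```python
import random


def split_into_patches(feat, grid):
    """Split a feature map feat[c][y][x] into grid*grid flattened patches."""
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    ph, pw = height // grid, width // grid  # assumes height and width divide evenly
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            patches.append([
                feat[c][gy * ph + y][gx * pw + x]
                for c in range(channels)
                for y in range(ph)
                for x in range(pw)
            ])
    return patches


def patchwise_encode(patches, weights, biases):
    """Apply a separate linear layer to each patch.

    weights[p][k] is the weight vector of latent unit k for patch p,
    so no parameters are shared between patches.
    """
    latent = []
    for patch, w_p, b_p in zip(patches, weights, biases):
        latent.append([
            sum(w * v for w, v in zip(w_k, patch)) + b_k
            for w_k, b_k in zip(w_p, b_p)
        ])
    return latent


# Hypothetical sizes: a 2-channel 8x8 encoder output, a 4x4 patch grid,
# and 3 latent values per patch.
rng = random.Random(0)
channels, size, grid, latent_dim = 2, 8, 4, 3
feat = [[[rng.gauss(0, 1) for _ in range(size)] for _ in range(size)]
        for _ in range(channels)]

patches = split_into_patches(feat, grid)
patch_dim = len(patches[0])
weights = [[[rng.gauss(0, 0.1) for _ in range(patch_dim)]
            for _ in range(latent_dim)] for _ in patches]
biases = [[0.0] * latent_dim for _ in patches]

latent = patchwise_encode(patches, weights, biases)
print(len(latent), len(latent[0]))  # 16 patches, 3 latent values each
```

Because the parameters are indexed by patch, perturbing one patch changes only that patch's latent code, which is one way to read the abstract's "multiple local autoencoders" interpretation.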

Keywords