IEEE Access (Jan 2021)

Multiple-Clothing Detection and Fashion Landmark Estimation Using a Single-Stage Detector

  • Hyo Jin Kim
  • Doo Hee Lee
  • Asim Niaz
  • Chan Yong Kim
  • Asif Aziz Memon
  • Kwang Nam Choi

DOI
https://doi.org/10.1109/ACCESS.2021.3051424
Journal volume & issue
Vol. 9
pp. 11694 – 11704

Abstract

Fashion image analysis has attracted significant research attention owing to the availability of large-scale fashion datasets with rich annotations. However, existing deep learning models for fashion datasets often have high computational requirements. In this study, we propose a new model suitable for low-power devices. The proposed network is a one-stage detector that rapidly detects multiple clothing items and their landmarks in fashion images. The network is designed as a modification of EfficientDet, originally proposed by Google Brain. The proposed network simultaneously trains the core input features at different resolutions and applies compound scaling to the backbone feature network. The bounding box, class, and landmark prediction networks maintain a balance between speed and accuracy. Moreover, its small number of parameters and low computational cost make it efficient. Without image preprocessing, we achieved a mean average precision (mAP) of 0.686 in bounding box detection and 0.450 in landmark estimation on the DeepFashion2 validation dataset, with an inference time of 42 ms. We obtained the best results through extensive experiments with different loss functions and optimizers. Furthermore, the proposed method can operate on low-power devices.
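
To illustrate the compound scaling mentioned above, the following is a minimal sketch based on the scaling heuristics published for the original EfficientDet, in which a single coefficient phi jointly sets the input resolution, the feature network width and depth, and the prediction head depth. The abstract does not state which coefficients the proposed network uses, so the constants below, the names `ScaledConfig` and `compound_scale`, and the idea that a landmark head would share the box/class head depth are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaledConfig:
    """Hypothetical container for the scaled hyperparameters."""
    phi: int
    input_resolution: int  # square input side length, in pixels
    bifpn_width: int       # number of channels in the feature network
    bifpn_depth: int       # number of repeated feature network layers
    head_depth: int        # layers in each prediction head


def compound_scale(phi: int) -> ScaledConfig:
    """Derive all scaled dimensions from one compound coefficient.

    Constants follow the heuristics of the original EfficientDet paper;
    whether the modified network keeps them is an assumption.
    """
    if phi < 0:
        raise ValueError("phi must be non-negative")
    return ScaledConfig(
        phi=phi,
        input_resolution=512 + 128 * phi,     # grows linearly
        bifpn_width=round(64 * 1.35 ** phi),  # grows exponentially
        bifpn_depth=3 + phi,                  # grows linearly
        head_depth=3 + phi // 3,              # grows every third step
    )


if __name__ == "__main__":
    for phi in range(4):
        print(compound_scale(phi))
```

Released EfficientDet configurations round the channel width to hardware-friendly values, so the raw formula output here may differ slightly from published tables.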

Keywords