IEEE Access (Jan 2018)

Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks

  • Chunxiao Fan,
  • Yang Li,
  • Guijin Wang,
  • Yong Li

DOI
https://doi.org/10.1109/ACCESS.2018.2850965
Journal volume & issue
Vol. 6
pp. 73357–73369

Abstract


This paper proposes drop transformation networks (DTNs), a novel framework for learning transformation-invariant representations of images with good flexibility and generalization ability. Convolutional neural networks are a powerful end-to-end learning framework that can learn hierarchies of representations. Although invariance to translation can be introduced by stacking convolutional and max-pooling layers, this approach is not effective against other geometric transformations such as rotation and scale. Rotation and scale invariance are usually obtained through data augmentation, but this requires a larger model and more training time. A DTN forms transformation-invariant representations by explicitly manipulating geometric transformations within the network: it applies multiple random transformations to its inputs but keeps only one output according to a given dropout policy. In this way, the complex dependencies on the transformations contained in the training data are alleviated, and generalization to transformations improves. Another advantage of DTNs is their flexibility: under the proposed framework, data augmentation can be seen as a special case. We evaluate DTNs on three benchmark data sets and show that they provide better performance with fewer parameters than state-of-the-art methods.
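As a rough illustration of the mechanism the abstract describes, the following Python sketch generates several randomly transformed copies of an input and keeps exactly one of them. The function name drop_transformation, the restriction of the transformation family to rotation, and the uniform keep policy are assumptions made for illustration; they are not the paper's actual implementation or policy.

```python
import numpy as np
from scipy.ndimage import rotate


def drop_transformation(x, rng, angle_range=(-30.0, 30.0), num_candidates=4):
    """Sketch of the drop-transformation idea: apply several random
    transformations to the input, then keep only one output.

    Here the transformation family is restricted to rotation for brevity;
    the framework in the paper covers other geometric transformations too.
    """
    # Generate candidate copies, each rotated by a randomly sampled angle.
    candidates = [
        rotate(x, angle=rng.uniform(*angle_range), reshape=False, mode="nearest")
        for _ in range(num_candidates)
    ]
    # "Dropout policy": keep one candidate. A uniform choice is an
    # assumption here; the paper's policy may differ.
    keep = rng.integers(num_candidates)
    return candidates[keep]


# Usage: apply the drop transformation to a dummy 28x28 grayscale image.
rng = np.random.default_rng(0)
img = rng.random((28, 28))
out = drop_transformation(img, rng)
print(out.shape)  # (28, 28)
```

Because only one of the transformed copies survives each forward pass, the model sees a different transformation of the same input over training iterations, which is how such a scheme can subsume plain data augmentation as a special case.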

Keywords