IEEE Access (Jan 2020)

Hierarchical Transformer Encoder With Structured Representation for Abstract Reasoning

  • Jinwon An,
  • Sungzoon Cho

DOI
https://doi.org/10.1109/ACCESS.2020.3035463
Journal volume & issue
Vol. 8
pp. 200229 – 200236

Abstract

Abstract reasoning is one of the defining characteristics of human intelligence and can be estimated by visual IQ tests such as Raven's Progressive Matrices. In this paper, we propose a hierarchical Transformer encoder with structured representation, a novel neural network architecture that improves both perception and reasoning in a visual IQ test. For perception, we use object detection models to extract structured features. For reasoning, we apply the Transformer encoder hierarchically, in a way that fits the structure of Raven's Progressive Matrices. Experimental results on the RAVEN dataset, one of the major large-scale datasets on Raven's Progressive Matrices, show that the proposed architecture achieves an overall accuracy of 99.62%, an improvement of more than 8 percentage points over CoPINet, the current state-of-the-art neural network model.
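
To illustrate what applying an encoder "in a hierarchical manner" can look like, here is a minimal pure-Python sketch. It is not the authors' implementation. It assumes each panel has already been reduced to a feature vector (a stand-in for the structured features from the object detector), it assumes a two-level hierarchy (panels within a row, then rows within the matrix), and it replaces a full Transformer encoder with a single unparameterized self-attention step followed by mean pooling. The names `self_attention`, `encode`, and `encode_matrix` are hypothetical.

```python
import math

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    Assumption: no learned weights, feed-forward layer, or normalization,
    so this is only a stand-in for one Transformer encoder layer.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def encode(tokens):
    """Attend over the tokens, then mean-pool them into a single vector."""
    attended = self_attention(tokens)
    return [sum(col) / len(attended) for col in zip(*attended)]

def encode_matrix(context, candidate):
    """Encode a 3x3 matrix: 8 context panels plus one candidate answer.

    Assumed hierarchy: the lower level encodes the three panels of each row,
    the upper level encodes the three resulting row embeddings.
    """
    panels = context + [candidate]  # row-major order, candidate fills the last cell
    rows = [panels[i:i + 3] for i in range(0, 9, 3)]
    row_embeddings = [encode(row) for row in rows]
    return encode(row_embeddings)
```

In a trained model, the vector returned for each candidate would feed a learned scoring head, and the highest-scoring candidate would be chosen as the answer; that scoring step is omitted here because the abstract does not describe it.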

Keywords