IEEE Access (Jan 2019)
Toward Encoding Vision-to-Touch With Convolutional Neural Networks
Abstract
The task of encoding visual information into tactile information has been studied since the 1960s, yet converting an image into the small set of signals that a tactile interface can deliver to the user remains an open challenge. In this study, we evaluated two methods that have not previously been applied to vision-to-touch encoding with convolutional neural networks: the bag of convolutional features (BoF) and the vector of locally aggregated descriptors (VLAD). We also introduce a new metric, semantic property evaluation (SPE), for assessing the semantic property of the encoded signal; it is based on the idea that objects with similar visual features should produce similar signals at the tactile interface. Using this metric, we show the advantage of the BoF and VLAD methods, which achieve SPE scores of 70.7% and 64.5%, respectively, a considerable improvement over the 56.2% obtained by the downscaling method used in many systems such as BrainPort.
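To give a concrete picture of the aggregation step mentioned above, the following is a minimal sketch of generic VLAD encoding over local CNN descriptors. It is not the authors' implementation; the function name, the use of a k-means codebook, and the normalization choices are illustrative assumptions only.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """Aggregate local CNN descriptors into a single VLAD vector.

    descriptors: (N, D) array of local feature vectors, e.g. the spatial
                 positions of a convolutional feature map.
    centroids:   (K, D) codebook learned beforehand (e.g. with k-means).
    Returns a flat (K*D,) normalized VLAD descriptor.
    """
    K, D = centroids.shape

    # Assign each local descriptor to its nearest codebook centroid.
    dists = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=2)
    assignments = np.argmin(dists, axis=1)

    # Accumulate residuals between descriptors and their assigned centroid.
    vlad = np.zeros((K, D))
    for k in range(K):
        members = descriptors[assignments == k]
        if len(members):
            vlad[k] = (members - centroids[k]).sum(axis=0)

    vlad = vlad.flatten()
    # Signed square-root (power) normalization followed by L2 normalization,
    # a common post-processing step for VLAD descriptors.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

The resulting fixed-length vector can then be reduced to the small number of channels a tactile display supports; the exact reduction used in the paper is described in the methods section.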
Keywords