Applied Artificial Intelligence (Jan 2018)

Fuzzy String Matching with a Deep Neural Network

  • Daniel Shapiro
  • Nathalie Japkowicz
  • Mathieu Lemay
  • Miodrag Bolic

DOI
https://doi.org/10.1080/08839514.2018.1448137
Journal volume & issue
Vol. 32, no. 1
pp. 1 – 12

Abstract

This work describes a deep neural network for character-level text classification. The system spots keywords in the text output of an optical character recognition (OCR) system using memoization and by encoding the text into feature vectors related to letter frequency. On the task of recognizing error messages in a set of generated images, dictionary and spell-check-based approaches achieved 69% to 88% accuracy, various deep learning approaches achieved 91% to 96% accuracy, and a combination of deep learning with a dictionary achieved 97% accuracy. The contribution of this work to the state of the art is a new approach to character-level deep neural network classification of noisy text.
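
To make the idea of a letter-frequency encoding with memoization concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the encoding, so the 26-letter lowercase alphabet, the relative-frequency normalization, and the names `letter_frequency_vector` and `cosine_similarity` are all assumptions made for illustration. The cosine comparison stands in for the neural network classifier only to show that an OCR-corrupted token stays close to its clean form in this feature space.

```python
from collections import Counter
from functools import lru_cache
from math import sqrt
from typing import Tuple
import string

# Assumption: a plain 26-letter lowercase alphabet; the paper may use a different one.
ALPHABET = string.ascii_lowercase


@lru_cache(maxsize=None)  # memoization: repeated tokens are encoded only once
def letter_frequency_vector(token: str) -> Tuple[float, ...]:
    """Encode a token as the relative frequency of each letter a-z (hypothetical encoding)."""
    letters = [c for c in token.lower() if c in ALPHABET]
    if not letters:
        # No alphabetic characters: return an all-zero vector.
        return (0.0,) * len(ALPHABET)
    counts = Counter(letters)
    total = len(letters)
    return tuple(counts[c] / total for c in ALPHABET)


def cosine_similarity(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    clean = letter_frequency_vector("error")
    noisy = letter_frequency_vector("err0r")  # simulated OCR mistake: 'o' read as '0'
    print(cosine_similarity(clean, noisy))
    print(letter_frequency_vector.cache_info())
```

In a pipeline like the one described, vectors of this kind would be the input features to a classifier. A frequency encoding discards character order, which is one reason a learned model or a dictionary lookup might be layered on top of it.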