Bulletin of the Polish Academy of Sciences: Technical Sciences (Jan 2021)
Recognition of handwritten Latin characters with diacritics using CNN
Abstract
Convolutional Neural Networks (CNN) have become widely used for problems in image analysis and text recognition. In this work, we assess the effectiveness of CNN-based architectures in which a network is trained to recognize handwritten characters based on the Latin script. European languages such as Dutch, French, and German use different variants of the Latin script, so in the conducted research the Latin alphabet was extended with certain characters with diacritics used in the Polish language. To evaluate the recognition results under the same conditions, a handwritten Latin dataset was also developed. The proposed CNN architecture achieved an accuracy of 96% for the extended character set, which is comparable to state-of-the-art results in handwritten character recognition. The presented approach extends the use of CNN-based recognition to different variants of Latin characters and appears to be an effective technique for a set of languages written using that script.
Keywords