Applied Sciences (Jan 2021)

Automatic Word Spacing of Korean Using Syllable and Morpheme

  • Jeong-Myeong Choi,
  • Jong-Dae Kim,
  • Chan-Young Park,
  • Yu-Seop Kim

DOI
https://doi.org/10.3390/app11020626
Journal volume & issue
Vol. 11, no. 2
p. 626

Abstract

In Korean, correct word spacing is essential to the readability of a sentence and to understanding its context. In natural language processing for Korean, a sentence with incorrect spacing also has a changed structure, which degrades performance. Previous studies corrected spacing errors using n-gram-based statistical methods and morphological analyzers, and many recent studies use deep learning. In this study, we address spacing error correction using both syllable-level and morpheme-level information. The proposed model combines a convolutional neural network layer, which learns syllable and morphological pattern information in sentences, with a bidirectional long short-term memory layer, which learns forward and backward sequence information. We evaluated the proposed model by accuracy at the syllable level and by precision, recall, and F1 score at the word level. The experiments confirmed that performance improved over the previous study.
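
The two evaluation levels named in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: spacing is encoded as one binary label per syllable (1 if a space follows that syllable), a predicted word counts as correct only when both of its boundaries match the reference, and the function names `to_labels`, `word_spans`, and `evaluate` are hypothetical. The paper's exact scoring rules may differ.

```python
def to_labels(sentence):
    """Split a spaced sentence into syllables and per-syllable labels.

    Assumed encoding: labels[i] is 1 if a space follows syllable i, else 0.
    """
    syllables, labels = [], []
    for ch in sentence.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1  # mark a space after the previous syllable
        else:
            syllables.append(ch)
            labels.append(0)
    return syllables, labels


def word_spans(labels):
    """Convert spacing labels into a set of (start, end) word spans."""
    spans, start = set(), 0
    for i, label in enumerate(labels):
        if label == 1 or i == len(labels) - 1:
            spans.add((start, i + 1))
            start = i + 1
    return spans


def evaluate(gold_sentence, pred_sentence):
    """Return syllable-level accuracy and word-level precision, recall, F1."""
    gold_syllables, gold = to_labels(gold_sentence)
    pred_syllables, pred = to_labels(pred_sentence)
    if not gold_syllables:
        raise ValueError("sentences must contain at least one syllable")
    if gold_syllables != pred_syllables:
        raise ValueError("sentences must differ only in spacing")

    # Syllable level: fraction of positions whose spacing label matches.
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

    # Word level: a predicted word is correct if the same span is in the reference.
    gold_spans, pred_spans = word_spans(gold), word_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example pair, not data from the paper.
    print(evaluate("나는 학교에 간다", "나는 학교 에간다"))
```

For the made-up pair in the usage line, five of the seven spacing labels match, while only one of the three predicted words ("나는") matches a reference word, so the word-level scores are much lower than the syllable-level accuracy. This gap is one reason to report both levels.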

Keywords