Applied Sciences (Oct 2022)

Automatic Correction of Indonesian Grammatical Errors Based on Transformer

  • Ahmad Musyafa,
  • Ying Gao,
  • Aiman Solyman,
  • Chaojie Wu,
  • Siraj Khan

DOI
https://doi.org/10.3390/app122010380
Journal volume & issue
Vol. 12, no. 20
p. 10380

Abstract

Grammatical error correction (GEC) is a major task in natural language processing (NLP) that has recently attracted considerable attention from researchers. GEC performance for widely used languages such as English and Chinese has improved significantly, which can be attributed to the many powerful applications supported by neural network models and pretrained language models. Motivated by these satisfactory results for widely used languages and by the lack of GEC research on low-resource languages, especially Indonesian, this paper proposes an automatic Indonesian grammar correction model based on the Transformer architecture that can also be applied to texts in other low-resource languages. Furthermore, we build a large Indonesian corpus that can be used to evaluate future Indonesian GEC work. We evaluate the models on this dataset, and the results show that the Transformer-based automatic error correction model achieves significant and satisfactory results compared with models from previous research.

Keywords