Acta Linguistica Asiatica (Jul 2020)

Examining the Part-of-speech Features in Assessing the Readability of Vietnamese Texts

  • An-Vinh Luong,
  • Diep Nguyen,
  • Dien Dinh

DOI
https://doi.org/10.4312/ala.10.2.127-142
Journal volume & issue
Vol. 10, no. 2

Abstract

The readability of a text plays an important role in selecting materials appropriate to the reader's level. Text readability in Vietnamese has received considerable attention in recent years; however, studies have mainly been limited to simple statistics such as sentence length and word length. In this article, we investigate the role of word-level grammatical characteristics in assessing the difficulty of texts in Vietnamese textbooks. We used machine learning models (for instance, Decision Tree, K-nearest neighbor, and Support Vector Machines) to evaluate the accuracy of classifying texts by readability, using word-level grammatical (part-of-speech, POS) features alongside other statistical characteristics. Empirical results show that including POS-level features increases classification accuracy by 2-4%.
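
To make the kind of feature described above concrete, the following is a minimal sketch of how surface statistics and POS-ratio features might be computed for one text. It is not the authors' feature set: the function name `extract_features`, the tag labels (`N`, `V`, `A`, `P`), and the choice of ratios are assumptions made for illustration, and the input is assumed to be already word-segmented and POS-tagged.

```python
from collections import Counter


def extract_features(tagged_sentences, pos_tags=("N", "V", "A")):
    """Compute surface and POS-ratio features for one text.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    pos_tags: hypothetical tag labels whose relative frequency is reported.
    """
    words = [word for sent in tagged_sentences for word, _ in sent]
    tag_counts = Counter(tag for sent in tagged_sentences for _, tag in sent)
    n_words = len(words)
    if n_words == 0:
        raise ValueError("text contains no words")

    # Surface statistics of the kind earlier studies relied on.
    features = {
        "avg_sentence_length": n_words / len(tagged_sentences),
        "avg_word_length": sum(len(word) for word in words) / n_words,
    }
    # POS-level features: share of words carrying each tag.
    for tag in pos_tags:
        features[f"ratio_{tag}"] = tag_counts[tag] / n_words
    return features


# Toy input ("Em xem phim. Phim hay."), tagged by hand for illustration.
sample = [
    [("em", "P"), ("xem", "V"), ("phim", "N")],
    [("phim", "N"), ("hay", "A")],
]
features = extract_features(sample)
```

A feature dictionary like this, computed per text, could then be turned into a vector and passed to any of the classifiers named in the abstract. Comparing accuracy with and without the `ratio_*` entries would mirror the comparison the abstract reports.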

Keywords