Tehnički Vjesnik (Jan 2021)

A Machine Learning Classification Algorithm for Vocabulary Grading in Chinese Language Teaching

  • Yinbing Zhang,
  • Jihua Song*,
  • Weiming Peng*,
  • Dongdong Guo,
  • Tianbao Song

DOI
https://doi.org/10.17559/TV-20210128043310
Journal volume & issue
Vol. 28, no. 3
pp. 845 – 855

Abstract

Vocabulary grading is of great importance in Chinese vocabulary teaching. This paper first analyzes the lexical attributes that affect lexical complexity. It then explains how lexical attribute information is extracted with the help of a constructed word-formation knowledge base, how a mapping function is built for each lexical attribute, and how the attributes are represented quantitatively to form the basis for vocabulary grading. On this basis, machine learning classification algorithms are applied to the Chinese vocabulary grading problem. Vocabulary grading models based on common machine learning classification algorithms are compared, the importance of the Chinese vocabulary attributes is measured with different feature selection methods, and grading models are constructed by combining the classification algorithms with the attributes that each feature selection algorithm ranks as most important. The experimental comparison showed that the classification model based on the support vector machine (SVM) algorithm and the top six attribute groups ranked by feature importance performed best. To improve vocabulary grading further, the importance values that several feature selection algorithms assigned to the lexical attributes were fused by averaging. A vocabulary grading experiment was then conducted that combined the Bagging + SVM ensemble algorithm with the top six attribute groups ranked by feature importance. The experimental results showed that this combined scheme performed better.
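The fusion step described in the abstract, averaging the importance that several feature selection algorithms assign to each lexical attribute and keeping the top six, can be illustrated with a minimal sketch. This is not the authors' implementation. The attribute names, the method names, and the scores below are hypothetical placeholders, and the min-max rescaling applied before averaging is an assumption made here so that methods with different score ranges contribute equally; the abstract only states that the importances were averaged.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale one method's importance scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All attributes tie under this method, so it carries no ranking signal.
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def fuse_importances(per_method_scores):
    """Average the normalized importance of each attribute across methods."""
    normalized = [min_max_normalize(s) for s in per_method_scores.values()]
    attributes = list(normalized[0].keys())
    return {a: mean(n[a] for n in normalized) for a in attributes}


def top_k(fused, k):
    """Return the k attribute names with the highest fused importance."""
    return sorted(fused, key=fused.get, reverse=True)[:k]


# Hypothetical scores from three hypothetical feature selection methods.
# These numbers are illustrative only and are not taken from the paper.
toy_scores = {
    "method_a": {"word_frequency": 0.9, "stroke_count": 0.7, "word_length": 0.4,
                 "morpheme_count": 0.5, "sense_count": 0.6, "structure_type": 0.3,
                 "pos_count": 0.2, "character_level": 0.8},
    "method_b": {"word_frequency": 40.0, "stroke_count": 25.0, "word_length": 10.0,
                 "morpheme_count": 18.0, "sense_count": 22.0, "structure_type": 5.0,
                 "pos_count": 8.0, "character_level": 35.0},
    "method_c": {"word_frequency": 0.30, "stroke_count": 0.15, "word_length": 0.05,
                 "morpheme_count": 0.10, "sense_count": 0.12, "structure_type": 0.08,
                 "pos_count": 0.02, "character_level": 0.18},
}

fused = fuse_importances(toy_scores)
selected = top_k(fused, 6)
print(selected)
```

The selected attributes would then be the input features of the classifier. One way to try a Bagging + SVM combination, assuming scikit-learn is available, is to wrap `sklearn.svm.SVC` in `sklearn.ensemble.BaggingClassifier`; whether this matches the authors' setup is not stated in the abstract.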

Keywords