Journal of Intelligent Systems (Apr 2017)

Contextualized Latent Semantic Indexing: A New Approach to Automated Chinese Essay Scoring

  • Xu Yanyan,
  • Ke Dengfeng,
  • Su Kaile

DOI
https://doi.org/10.1515/jisys-2015-0048
Journal volume & issue
Vol. 26, no. 2
pp. 263 – 285

Abstract

The writing component of Chinese language tests badly needs a mature automated essay scoring system. In this paper, we propose a new approach to automated Chinese essay scoring (ACES), called contextualized latent semantic indexing (CLSI), which comes in two versions: Genuine CLSI and Modified CLSI. Two critical components of our ACES system, the n-gram language model and the weighted finite-state transducer (WFST), are used to extract context information. CLSI not only improves on conventional latent semantic indexing (LSI) but also bridges the gap between latent semantics and their context information, which LSI does not capture. Moreover, CLSI can score essays from the perspectives of both language fluency and content, and it addresses the local overrating and underrating problems caused by LSI. Experimental results show that CLSI outperforms LSI, Regularized LSI, and latent Dirichlet allocation in many respects, and thus proves to be an effective approach.

Keywords