Applied Mathematics and Nonlinear Sciences (Jan 2024)

Construction of a text complexity grading model for English textbooks in the context of globalization

  • Wu Anping,
  • Zhao Fei,
  • Zhang Qiong

DOI
https://doi.org/10.2478/amns.2023.1.00131
Journal volume & issue
Vol. 9, no. 1

Abstract

Text complexity is an important construct in applied linguistics and in research on English reading instruction in the context of globalization. The main purpose of studying it is to match learners with reading texts appropriate to their language level in order to improve learning outcomes. In addition to the lexical and syntactic dimensions, this study adds discourse complexity features and examines their application in constructing a text complexity grading model based on feature optimization. Using the text processing software developed by Kyle’s team, with BNCbaby as the reference corpus, we extracted fine-grained indicators of lexical, syntactic, and discourse complexity, then applied principal component analysis to reduce their dimensionality and determine the principal component features used to build the model. Different classification algorithms were used to construct separate models, and their performance was compared. Models constructed from the feature set have significant advantages over models based on common traditional readability formulas and other single-dimensional features. In addition, the feature set and modeling method grade other domestic textbook datasets well and perform well in grading prediction across several different datasets, indicating strong generalization ability. This study integrates multivariate linguistic features with neural networks to construct a text complexity grading model, which provides a new path for text complexity research. The results have theoretical significance for text complexity research and high application value in applied linguistics: they can inform the selection of students’ reading materials, the writing and adaptation of teaching materials, and the planning and test design of reading courses.
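
The workflow summarized in the abstract (standardize the extracted indicators, reduce them with principal component analysis, then train and compare several classifiers) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the use of scikit-learn, the synthetic data standing in for the extracted indicators, the 90% explained-variance cutoff, and the three classifiers chosen. None of these details are stated in the abstract, and this is not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in data: 300 "texts", 40 complexity indicators, 3 grade levels.
# Real indicators would come from a text analysis tool; these are random numbers
# shifted by grade so that the classifiers have a signal to learn.
rng = np.random.default_rng(0)
n_texts, n_features, n_grades = 300, 40, 3
y = rng.integers(0, n_grades, size=n_texts)
X = rng.normal(size=(n_texts, n_features)) + 0.5 * y[:, None]


def build_model(classifier):
    """Standardize features, keep components explaining 90% of variance, classify."""
    # The 0.90 threshold is an assumption, not a value taken from the study.
    return make_pipeline(StandardScaler(), PCA(n_components=0.90), classifier)


# Assumed candidate classifiers; the abstract only says "different classification
# algorithms" and mentions neural networks.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "neural_network": MLPClassifier(
        hidden_layer_sizes=(32,), max_iter=2000, random_state=0
    ),
}

# Mean 5-fold cross-validated accuracy for each candidate model.
scores = {
    name: cross_val_score(build_model(clf), X, y, cv=5).mean()
    for name, clf in candidates.items()
}

# Number of principal components retained when fitting on the full data set.
fitted = build_model(LogisticRegression(max_iter=1000)).fit(X, y)
n_components_kept = fitted.named_steps["pca"].n_components_

for name, score in scores.items():
    print(f"{name}: mean accuracy = {score:.3f}")
print(f"principal components kept: {n_components_kept} of {n_features}")
```

Placing the scaler and PCA inside the pipeline means they are refit on each training fold, so the cross-validated scores are not inflated by information from the held-out texts.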

Keywords