Decision Support System for Evaluating Corpus-Based Word Lists for Use in English Language Teaching Contexts

Ruoxi Yin; Chunmei Zhu; Jiuyang Zhu

doi:10.1109/access.2025.3579865

IEEE Access (Jan 2025)

Affiliations

Ruoxi Yin: Foreign Language Department, Xuzhou Medical University, Xuzhou, Jiangsu, China
Chunmei Zhu: Foreign Language Department, Xuzhou Medical University, Xuzhou, Jiangsu, China
Jiuyang Zhu: Foreign Language Department, Xuzhou Medical University, Xuzhou, Jiangsu, China

DOI: https://doi.org/10.1109/access.2025.3579865
Journal volume & issue: Vol. 13
pp. 106369 – 106386

Abstract

The selection of pedagogically relevant vocabulary is a critical aspect of English Language Teaching (ELT), yet traditional corpus-based word lists often rely solely on frequency-based ranking, limiting their effectiveness in real-world learning contexts. This study proposes a Decision Support System (DSS) that integrates corpus linguistics, Natural Language Processing (NLP), and machine learning to generate optimized vocabulary lists tailored for ELT. The DSS enhances word selection accuracy by incorporating contextual relevance, collocational strength, and pedagogical value, surpassing the limitations of static frequency-based models. The experimental evaluation compares DSS performance with traditional frequency-based methods and the Academic Word List (AWL). The results indicate that the DSS achieves significantly higher ranking accuracy (MRR: 0.89, NDCG: 0.85) and improves classification performance (F1-score: 87.3%), ensuring that highly relevant words appear earlier in ranked lists. A case study in general and academic English contexts demonstrates that DSS-generated word lists offer greater usability, stronger contextual fit, and enhanced pedagogical effectiveness, as confirmed by positive teacher feedback (usability score: 4.7/5). Statistical validation using t-tests and ANOVA confirms that DSS ranking improvements are statistically significant (p < 0.001), and effect size analysis (Cohen’s d > 0.8) highlights its substantial impact on vocabulary instruction. Despite certain limitations such as corpus dependency and limited multilingual support, the DSS presents a scalable and adaptable solution for data-driven vocabulary selection in ELT. Future enhancements will focus on multilingual capabilities, personalized learning, and hybrid ranking models to further refine the system’s applicability in diverse educational settings.
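
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how MRR, NDCG, and Cohen's d are commonly computed. It uses the standard textbook definitions and is not the authors' implementation; the function names and any example words or scores are hypothetical and chosen only for illustration.

```python
import math
import statistics


def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none appears."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevant_sets):
    """Average the reciprocal rank over several ranked lists (MRR)."""
    scores = [reciprocal_rank(r, s) for r, s in zip(rankings, relevant_sets)]
    return sum(scores) / len(scores)


def dcg(gains):
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(ranked, gain_by_item, k=None):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    k = len(ranked) if k is None else k
    gains = [gain_by_item.get(item, 0.0) for item in ranked[:k]]
    ideal = sorted(gain_by_item.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample SD."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)
```

Under these definitions, an MRR close to 1 means the first relevant word usually sits at or near the top of each ranked list, an NDCG close to 1 means the whole ordering is close to the ideal one, and a Cohen's d above 0.8 is conventionally read as a large effect.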

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
