A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts
Zan Qiu,
Guimin Huang,
Xingguo Qin,
Yabing Wang,
Jiahao Wang,
Ya Zhou
Affiliations
Zan Qiu
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Guimin Huang
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Xingguo Qin
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Yabing Wang
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Jiahao Wang
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Ya Zhou
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Traditional topic models lose accuracy when co-occurring vocabulary in the corpus is sparse, whereas conventional word embedding models overemphasize contextual semantic information and capture the domain-specific features of a text inadequately. This paper proposes a hybrid semantic representation method that combines a topic model integrating conceptual knowledge with a weighted word embedding model. Specifically, we construct a topic model that incorporates the Probase concept knowledge base to perform topic clustering and obtain a topic semantic representation. In addition, we design a weighted word embedding model to enhance the representation of the text's contextual semantic information. A feature-based information fusion model then integrates the two textual representations to generate the hybrid semantic representation. We evaluated the proposed model on several English composition test sets. The results show that it achieves higher accuracy and greater practical value than existing text representation methods.
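To make the fusion step in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the weighting scheme or the fusion operator, so this sketch assumes TF-IDF weights for the word embeddings and plain concatenation of L2-normalized vectors for the feature-level fusion. The toy embeddings, the topic vectors (which stand in for the output of the Probase-based topic model), and all function names are hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def weighted_embedding(doc, embeddings, idf, dim):
    """TF-IDF-weighted average of the word vectors of one document.

    Assumption: TF-IDF is used as the weight; the paper's actual
    weighting scheme is not given in the abstract.
    """
    tf = Counter(tok for tok in doc if tok in embeddings)
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        w = count * idf.get(tok, 1.0)
        weight_sum += w
        for i, v in enumerate(embeddings[tok]):
            total[i] += w * v
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


def l2_normalize(vec):
    """Scale a vector to unit length; leave a zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(topic_vec, embed_vec):
    """Feature-level fusion (assumed here): normalize each view, then concatenate."""
    return l2_normalize(topic_vec) + l2_normalize(embed_vec)


# Toy data, all hypothetical.
docs = [["cat", "dog", "pet"], ["stock", "market", "cat"]]
embeddings = {
    "cat": [1.0, 0.0], "dog": [0.0, 1.0], "pet": [1.0, 1.0],
    "stock": [-1.0, 0.0], "market": [0.0, -1.0],
}
# Stand-in for per-document topic distributions from a topic model.
topic_vecs = [[0.9, 0.1], [0.2, 0.8]]

idf = idf_weights(docs)
hybrid = [
    fuse(topic, weighted_embedding(doc, embeddings, idf, dim=2))
    for doc, topic in zip(docs, topic_vecs)
]
```

Each entry of `hybrid` has one block of topic features followed by one block of embedding features. Normalizing each block before concatenation keeps either view from dominating purely because of its scale, which is one common design choice; a learned fusion layer would be another.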