A Hybrid Semantic Representation Method Based on Fusion Conceptual Knowledge and Weighted Word Embeddings for English Texts

Zan Qiu; Guimin Huang; Xingguo Qin; Yabing Wang; Jiahao Wang; Ya Zhou

doi:10.3390/info15110708

Information (Nov 2024)

Affiliations

Zan Qiu: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Guimin Huang: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Xingguo Qin: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Yabing Wang: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Jiahao Wang: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Ya Zhou: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China

DOI: https://doi.org/10.3390/info15110708
Journal volume & issue: Vol. 15, no. 11, p. 708

Abstract

The accuracy of traditional topic models can suffer from the sparsity of co-occurring vocabulary in a corpus, whereas conventional word embedding models tend to overemphasize contextual semantic information and fail to adequately capture domain-specific features of the text. This paper proposes a hybrid semantic representation method that combines a topic model integrating conceptual knowledge with a weighted word embedding model. Specifically, we construct a topic model that incorporates the Probase concept knowledge base to perform topic clustering and obtain a topic semantic representation. We also design a weighted word embedding model to strengthen the representation of the text's contextual semantic information. A feature-based information fusion model then integrates the two textual representations into a hybrid semantic representation. The proposed model was evaluated on several English composition test sets. The results indicate that it achieves higher accuracy and greater practical value than existing text representation methods.
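
To make the fusion idea concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the weighting scheme or the fusion operator, so two assumptions are made here: each word carries a scalar importance weight (for example, a TF-IDF score), and feature-level fusion is a simple concatenation of the topic vector with the weighted embedding. The names `weighted_embedding` and `fuse`, and all numeric values, are hypothetical.

```python
from typing import Dict, List


def weighted_embedding(
    tokens: List[str],
    vectors: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of word vectors (assumed weighting scheme).

    Tokens without a vector are skipped; tokens without a weight get 1.0.
    """
    if not vectors:
        return []
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue
        w = weights.get(tok, 1.0)
        for i, x in enumerate(vectors[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


def fuse(topic_vec: List[float], embed_vec: List[float]) -> List[float]:
    """Feature-level fusion by concatenation (an assumption, not from the paper)."""
    return list(topic_vec) + list(embed_vec)


# Toy inputs, invented for illustration only.
vectors = {"essay": [1.0, 0.0], "topic": [0.0, 1.0]}
weights = {"essay": 3.0, "topic": 1.0}
tokens = ["essay", "topic", "unknown"]

topic_vec = [0.2, 0.8]  # stands in for a document-topic distribution
embed_vec = weighted_embedding(tokens, vectors, weights)
hybrid = fuse(topic_vec, embed_vec)
print(hybrid)
```

Concatenation keeps the topic and contextual features in separate dimensions, so a downstream model can weigh them independently; other fusion operators would fit the abstract's description equally well.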

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
