Jisuanji kexue yu tansuo (Jul 2022)

Long Text Generation Adversarial Network Model with Self-Attention Mechanism

  • XIA Hongbin, XIAO Yifei, LIU Yuan

DOI
https://doi.org/10.3778/j.issn.1673-9418.2104038
Journal volume & issue
Vol. 16, no. 7
pp. 1603 – 1610

Abstract


In recent years, communication between humans and computers has become inseparable from daily life, so natural language processing, as the interaction technology between human and machine, attracts increasing attention from researchers. Text generation is one of the common tasks in natural language processing. Currently, generative adversarial networks (GAN) are widely used in text generation and perform well. To address the sparsity of the scalar guidance signal in the traditional GAN discriminator and its limitation of learning only partial semantic information of the text, a self-attention leak generative adversarial network (SALGAN), which combines a multi-head self-attention mechanism with a leaked GAN, is proposed. Firstly, a CNN integrated with the multi-head self-attention mechanism is used as the feature extractor, strengthening its feature extraction ability. Secondly, the features extracted by the discriminator are leaked to the generator as step-by-step guidance signals that steer text generation, making the generated text closer to the reference text. Finally, the generator produces text and passes it to the discriminator, which judges whether it is real, i.e., whether it meets the standards of human language. Experiments are carried out on two real datasets, COCO image captions and EMNLP2017 news, with BLEU as the evaluation metric. The results show that after the multi-head self-attention mechanism is integrated into the CNN, the extracted features capture global semantic information, and the feature extraction performance of the CNN is significantly improved.
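The core idea of the feature extractor described above is to let every position in the CNN's feature sequence attend to every other position, so that each feature carries global rather than only local semantic information. A minimal NumPy sketch of scaled dot-product multi-head self-attention over such a feature sequence is shown below; the weights are random for illustration (in the model they are learned), and all dimensions and names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads, rng):
    """X: (seq_len, d_model) features, e.g. CNN outputs along the
    token axis. Projection weights are sampled randomly here purely
    to illustrate the computation; in SALGAN they would be trained."""
    seq_len, d_model = X.shape
    d_k = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        Wq = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Wk = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Wv = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # scaled dot-product attention: every position attends to all
        # positions, injecting global context into each local feature
        A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
        heads.append(A @ V)
    # concatenate heads and project back to d_model
    Wo = rng.standard_normal((num_heads * d_k, d_model)) / np.sqrt(d_model)
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
features = rng.standard_normal((20, 64))  # 20 positions, 64-dim features
out = multi_head_self_attention(features, num_heads=4, rng=rng)
print(out.shape)  # (20, 64): same shape, now globally contextualized
```

Because the attention output has the same shape as its input, such a layer can be inserted between convolutional stages of the discriminator without changing the rest of the architecture.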

Keywords