Exploring Prompts in Few-Shot Cross-Linguistic Topic Classification Scenarios

Zhipeng Zhang; Shengquan Liu; Jianming Cheng

doi:10.3390/app13179944

Applied Sciences (Sep 2023)

Affiliations

Zhipeng Zhang: College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
Shengquan Liu: College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
Jianming Cheng: College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China

DOI: https://doi.org/10.3390/app13179944
Journal volume & issue: Vol. 13, no. 17, p. 9944

Abstract

In recent years, large-scale pretrained language models have become widely used in natural language processing tasks. Building on these models, prompt learning has achieved excellent performance in few-shot classification scenarios. The core idea of prompt learning is to convert a downstream task into a masked language modelling task. However, the choice of prompt template can greatly affect the results, and finding an appropriate template is difficult and time-consuming. To address this, this study proposes a novel hybrid prompt approach that combines discrete prompts and continuous prompts, encouraging the model to learn more semantic knowledge from a small number of training samples. Comparing hybrid prompts against discrete prompts and continuous prompts, we find that hybrid prompts achieve the best results, reaching an F1 score of 73.82% on the test set. In addition, we analyze the effect of different virtual token lengths in continuous prompts and hybrid prompts in a few-shot cross-language topic classification scenario. The results show that there is a threshold for the number of virtual tokens: too many virtual tokens degrade model performance, and the number should preferably not exceed the average length of the texts in the training set. Finally, this paper designs a method based on vector similarity to explore what the virtual tokens actually represent. The experimental results show that the prompts learned automatically through the virtual tokens are correlated to some degree with the input text.
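
The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: a hybrid prompt (trainable virtual tokens placed alongside a hand-written cloze template with a mask slot) and a vector-similarity lookup that maps a virtual token's embedding to its nearest vocabulary word. The function names, the `[V0]`-style placeholders, the template wording, and the toy three-dimensional embeddings are all assumptions made for illustration; they are not taken from the paper.

```python
import math

MASK = "[MASK]"  # assumed mask symbol; real models define their own


def build_hybrid_prompt(text, discrete_template, n_virtual):
    """Prepend n_virtual placeholder tokens to a hand-written cloze template.

    In an actual system each placeholder would be backed by a trainable
    embedding; here they are plain strings (hypothetical naming).
    """
    virtual = [f"[V{i}]" for i in range(n_virtual)]
    filled = discrete_template.format(text=text, mask=MASK)
    return " ".join(virtual + [filled])


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_tokens(virtual_vec, vocab_embeddings, k=1):
    """Return the k vocabulary tokens most similar to a virtual token vector."""
    ranked = sorted(
        vocab_embeddings,
        key=lambda tok: cosine(virtual_vec, vocab_embeddings[tok]),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented purely for this example.
toy_vocab = {
    "sports": [1.0, 0.0, 0.0],
    "finance": [0.0, 1.0, 0.0],
    "health": [0.0, 0.0, 1.0],
}
learned_virtual = [0.9, 0.1, 0.0]  # stand-in for a trained virtual token

prompt = build_hybrid_prompt(
    "The team won the final.", "{text} This topic is {mask}.", n_virtual=2
)
print(prompt)
print(nearest_tokens(learned_virtual, toy_vocab, k=2))
```

With a real pretrained model, the vocabulary embeddings would come from the model's input embedding matrix and the virtual token vectors from the trained prompt parameters; the ranking step itself would stay the same.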

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
