Jisuanji kexue (Nov 2021)

Study on Text Retrieval Based on Pre-training and Deep Hash

  • ZOU Ao, HAO Wen-ning, JIN Da-wei, CHEN Gang, TIAN Yuan

DOI
https://doi.org/10.11896/jsjkx.210300266
Journal volume & issue
Vol. 48, no. 11
pp. 300 – 306

Abstract

To address the low efficiency and accuracy of text retrieval, a retrieval model based on a pre-trained language model and a deep hashing method is proposed. First, the prior knowledge of text contained in the pre-trained language model is introduced through transfer learning, and feature extraction transforms the input into a high-dimensional vector representation. A hash learning layer is then added to the back end of the model, and the model parameters are fine-tuned against purpose-designed optimization objectives, so that the hash function and a unique hash representation of each input are learned dynamically during training. Experimental results show that the retrieval accuracy of this method is at least 21.70% and 21.38% higher than that of the other benchmark models at top-5 and top-10, respectively. Introducing hash codes improves retrieval speed by a factor of 40 while losing only 4.78% in accuracy. The method can therefore significantly improve both retrieval accuracy and efficiency, and it has potential applications in the field of text retrieval.
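
The speed gain from hash codes comes from comparing short binary codes instead of high-dimensional real-valued vectors. The following is a minimal sketch of that retrieval step only, not the authors' model: it assumes the codes are obtained by taking the sign of each component of an already-computed vector, whereas the paper learns its hash function in a trained layer. The function names (binarize, hamming, top_k) and the toy vectors are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple


def binarize(vector: Sequence[float]) -> int:
    """Pack the sign pattern of a real-valued vector into an integer bit code.

    Assumption: sign thresholding stands in for the learned hash layer.
    """
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def top_k(query_code: int, doc_codes: Sequence[int], k: int) -> List[Tuple[int, int]]:
    """Return (document index, Hamming distance) for the k nearest codes."""
    scored = sorted((hamming(query_code, c), i) for i, c in enumerate(doc_codes))
    return [(i, d) for d, i in scored[:k]]


# Toy, made-up vectors standing in for the model's text representations.
docs = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, -0.3, -0.9],
]
doc_codes = [binarize(v) for v in docs]
query_code = binarize([0.8, -0.1, 0.5, -0.4])
results = top_k(query_code, doc_codes, k=2)
print(results)
```

Each comparison here is one XOR and one bit count, which is far cheaper than a floating-point similarity over the full vector; the price is the information discarded when the vector is reduced to bits, consistent with the small accuracy loss reported above.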

Keywords