Jisuanji kexue yu tansuo (Aug 2021)

Literature Review of Cross-Modal Retrieval Research

  • CHEN Ning, DUAN Youxiang, SUN Qifeng

DOI
https://doi.org/10.3778/j.issn.1673-9418.2101092
Journal volume & issue
Vol. 15, no. 8
pp. 1390 – 1404

Abstract

With the rapid development of Internet technology and the spread of smart devices, the volume of multimedia data is exploding and its forms are becoming increasingly diverse. Single-modal data retrieval no longer satisfies people's demand for information, and realizing cross-modal retrieval through knowledge collaboration across different modalities has become a research hotspot in recent years. Based on an in-depth analysis of the research background and progress of cross-modal retrieval, and taking common subspace modeling, the key technology of cross-modal retrieval, as the main thread, this paper analyzes three types of cross-modal retrieval methods: traditional statistical analysis, deep learning, and hash learning. It compares these methods from multiple angles in terms of research content, key technologies, limitations, applicability, and characteristics, and reports experiments for a more in-depth comparison. Finally, it discusses the open difficulties in cross-modal retrieval, future research directions, mainstream design ideas, and development trends in recent years, to provide a theoretical basis for further research.

Keywords