Jisuanji kexue (May 2022)
Study on Cross-media Information Retrieval Based on Common Subspace Classification Learning
Abstract
The semantic similarity between data of two different media types cannot be calculated directly because of the severe heterogeneity gap and semantic gap between them, which hinders both the implementation and the effectiveness of cross-media retrieval. Although common space learning can achieve cross-media semantic association and retrieval, its retrieval performance is unsatisfactory. The main reason is that it relies on generic feature extraction techniques and general-purpose classification algorithms to implement semantic correlation and matching. To address this problem, the study proposes a novel cross-media correlation method, called Stacking-DSCM-WR, for cross-media retrieval between documents and images. WR means that text features are extracted with a word-embedding technique and image features are extracted with ResNet. DSCM means that deep semantic correlation and matching is used to project data of different modalities into a common subspace. Stacking is an ensemble learning algorithm; it is employed to produce the distributions of text documents and images over the same high-level conceptual semantic space for cross-media retrieval. Experiments are carried out on two smaller cross-media datasets, Wikipedia and Pascal Sentence, and one larger cross-media dataset, INRIA-Websearch. The results show that the proposed method can effectively extract text and image features and realize the correlation and matching of cross-media data in a high-level semantic space. Comparisons with similar cross-media retrieval methods show that the proposed method achieves the best retrieval performance as measured by mean average precision (MAP).
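As a rough illustration of the evaluation described above, the following minimal sketch computes MAP for text-to-image retrieval when both modalities are represented as class-probability vectors in a shared semantic space. It is not the authors' implementation: the function names are hypothetical, and the use of cosine similarity for ranking is an assumption, since the abstract does not state which similarity measure is used.

```python
import numpy as np


def cosine_similarity_matrix(queries, gallery):
    """Pairwise cosine similarity between query rows and gallery rows."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return q @ g.T


def average_precision(relevant_ranked):
    """Average precision for one query, given relevance flags in ranked order."""
    relevant_ranked = np.asarray(relevant_ranked, dtype=bool)
    n_relevant = relevant_ranked.sum()
    if n_relevant == 0:
        return 0.0
    hits = np.cumsum(relevant_ranked)               # relevant items seen so far
    ranks = np.arange(1, len(relevant_ranked) + 1)  # 1-based rank positions
    precision_at_k = hits / ranks
    # Average the precision values at the positions of the relevant items.
    return float((precision_at_k * relevant_ranked).sum() / n_relevant)


def mean_average_precision(query_vecs, query_labels, gallery_vecs, gallery_labels):
    """MAP over all queries; an item is relevant if it shares the query's class."""
    sims = cosine_similarity_matrix(query_vecs, gallery_vecs)
    aps = []
    for i in range(sims.shape[0]):
        order = np.argsort(-sims[i], kind="stable")  # most similar first
        relevant = gallery_labels[order] == query_labels[i]
        aps.append(average_precision(relevant))
    return float(np.mean(aps))


# Toy data (illustrative only): two classes, vectors are class probabilities
# such as a stacked classifier might output for each modality.
text_vecs = np.array([[0.9, 0.1], [0.2, 0.8]])
text_labels = np.array([0, 1])
image_vecs = np.array([[0.8, 0.2], [0.1, 0.9], [0.7, 0.3]])
image_labels = np.array([0, 1, 0])

toy_map = mean_average_precision(text_vecs, text_labels, image_vecs, image_labels)
```

Swapping the roles of the text and image arrays gives the image-to-text direction; cross-media retrieval results are commonly reported for both directions.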
Keywords