Jisuanji kexue (Oct 2021)

Fusion Vectorized Representation Learning of Multi-source Heterogeneous User-generated Contents

  • JI Nan-xun, SUN Xiao-yan, LI Zhen-qi

DOI
https://doi.org/10.11896/jsjkx.200900194
Journal volume & issue
Vol. 48, no. 10
pp. 51 – 58

Abstract

With the development of mobile networks and apps, user-generated content (UGC) containing multi-source heterogeneous data such as evaluations, markings, scores, images and videos provides highly valuable information for improving the quality of personalized services. Learning a fused, vectorized representation of multi-source heterogeneous UGC is the most critical issue for applying this information successfully. Motivated by this, we propose a representation learning method that effectively fuses and vectorizes comment and image data. We use the Doc2vec and LDA models to extract the features of the multi-source comments, and represent the images associated with the comments with a deep convolutional network. We then present a hybrid vectorized representation learning approach for fusing the comments and a convolution strategy for integrating images and comments. The feasibility and effectiveness of the proposed method are demonstrated on typical public Amazon data sets with heterogeneous UGC, in which the vectorized multi-source heterogeneous UGC is taken as the representation of each product and the classification accuracies of the products are compared.
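As a rough illustration of what a fused product vector could look like, the sketch below concatenates three per-product feature vectors after L2 normalization. This is an assumption made for illustration only: the abstract does not specify the hybrid fusion or the convolution strategy, so plain concatenation is swapped in here, and the names `l2_normalize`, `fuse_product_vector`, and all input values are hypothetical placeholders standing in for Doc2vec, LDA, and convolutional-network outputs.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (all-zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse_product_vector(
    doc_vec: Sequence[float],
    topic_vec: Sequence[float],
    image_vec: Sequence[float],
) -> List[float]:
    """Concatenate normalized comment, topic, and image features.

    Assumption: simple concatenation stands in for the paper's fusion
    strategy, which the abstract does not describe in detail.
    """
    fused: List[float] = []
    for part in (doc_vec, topic_vec, image_vec):
        # Normalizing each source keeps one modality from dominating by scale.
        fused.extend(l2_normalize(part))
    return fused


# Placeholder inputs (not real model outputs):
doc_vec = [3.0, 4.0]         # stands in for a Doc2vec comment embedding
topic_vec = [0.5, 0.5, 0.0]  # stands in for an LDA topic distribution
image_vec = [0.0, 2.0]       # stands in for a CNN image feature
product_vec = fuse_product_vector(doc_vec, topic_vec, image_vec)
```

A vector built this way could then be passed to any standard classifier to compare product classification accuracy, which is how the abstract describes the evaluation.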

Keywords