大数据 (Mar 2024)

A text classification method based on multimodal fusion enhancement

  • Dezhi LIU,
  • Liu HE,
  • Youfeng LIU,
  • Dechun HAN

Journal volume & issue
Vol. 10
pp. 80 – 93

Abstract

Although multimodal text classification techniques show potential in specific scenarios, they still have limitations. Existing multimodal fusion models require the modalities of the input data to be aligned, so a large amount of incomplete multimodal data is discarded outright, which limits the scale and flexibility of the data available for inference. To address this problem, we propose a text classification model based on multimodal fusion enhancement, together with a training method for insufficient multimodal resources. Compared with traditional methods, our model improves performance by an average of 4.25% on a standard dataset. Furthermore, when the missing rate of the non-text modalities is 50%, the insufficient multimodal resource training method improves performance by about 4% compared with traditional multi-route strategies. The experimental results demonstrate the effectiveness of the proposed model and training method.

Keywords