IET Software (Jan 2024)

Design and Efficacy of a Data Lake Architecture for Multimodal Emotion Feature Extraction in Social Media

  • Yuanyuan Fan,
  • Xifeng Mi

DOI
https://doi.org/10.1049/2024/6819714
Journal volume & issue
Vol. 2024

Abstract

Precise sentiment analysis (SA) of multimodal social media data is increasingly in demand. This paper introduces a data lake architecture designed for efficient multimodal emotion feature extraction, addressing the challenges posed by diverse data types. The proposed framework combines a storage solution with an SA model, multilevel spatial attention fusion (MLSAF), that handles text and visual data concurrently. The data lake architecture comprises five layers, which support real-time and offline data collection, storage, processing, standardized interface services, and data mining analysis. The MLSAF model, integrated into the data lake architecture, employs a text-guided spatial attention mechanism that fuses textual and visual features to discern subtle emotional interplays. The model is trained end to end, and its attention modules help it capture nuanced sentiment expressions. Empirical evaluations on two established multimodal sentiment datasets, MVSA-Single and MVSA-Multi, validate the effectiveness of the proposed methodology. In comparisons with state-of-the-art models, the approach improves accuracy by 6% on MVSA-Single and 1.6% on MVSA-Multi. The research contributes to SA of social media data by offering a versatile framework for data management and analysis that integrates MLSAF with a scalable data lake architecture.
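
The abstract does not specify how the text-guided spatial attention is computed, so the following is only a minimal sketch of the general idea, not the MLSAF model itself. It assumes a single text vector that scores a grid of image-region features through scaled dot-product attention, followed by fusion by concatenation. The function names, the projection matrices `w_text` and `w_img`, and all dimensions are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(vec, weight):
    """Multiply a vector of length n by a weight matrix with n rows and d columns."""
    return [
        sum(v * row[j] for v, row in zip(vec, weight))
        for j in range(len(weight[0]))
    ]


def text_guided_spatial_attention(text_vec, region_feats, w_text, w_img):
    """Weight image regions by their relevance to the text, then fuse.

    Assumed shapes (hypothetical): text_vec has d_t entries, region_feats is a
    list of r region vectors with d_v entries each, w_text is d_t x d and
    w_img is d_v x d.
    """
    query = project(text_vec, w_text)                  # text as the query, length d
    keys = [project(r, w_img) for r in region_feats]   # one key per image region
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)                          # spatial attention map over regions
    d_v = len(region_feats[0])
    attended = [
        sum(w * r[j] for w, r in zip(weights, region_feats)) for j in range(d_v)
    ]                                                  # text-weighted visual summary
    fused = list(text_vec) + attended                  # fusion by concatenation (assumption)
    return fused, weights


# Toy inputs: an 8-dim text vector and a 7x7 grid (49 regions) of 16-dim features.
rng = random.Random(0)
text_vec = [rng.gauss(0, 1) for _ in range(8)]
region_feats = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(49)]
w_text = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(8)]
w_img = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(16)]

fused, weights = text_guided_spatial_attention(text_vec, region_feats, w_text, w_img)
```

Here `weights` acts as a spatial attention map: one non-negative value per image region, summing to 1. In a trained model the projections would be learned and the fused vector would feed a sentiment classifier; both steps are omitted from this sketch.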