IEEE Access (Jan 2025)

Advancements in Large-Scale Image and Text Representation Learning: A Comprehensive Review and Outlook

  • Yang Qin,
  • Shuxue Ding,
  • Huiming Xie

DOI
https://doi.org/10.1109/access.2025.3541194
Journal volume & issue
Vol. 13
pp. 49922 – 49933

Abstract


Large-scale image and text representation learning is critical to the performance of multimodal tasks involving images and text, such as visual question answering and image captioning. Most existing research on large-scale image and text representation learning relies on Transformer networks for pre-training, i.e., learning generic semantic representations from large-scale image-text pairs. These representations are then fine-tuned and transferred to downstream multimodal tasks. This paper first provides a brief analysis of the advantages of pre-training models. It then comprehensively summarizes the relevant research on large-scale image and text representation learning based on pre-training, focusing on pre-training model architectures, pre-training tasks, and image-text datasets. Finally, we provide a summary and outlook on large-scale image and text representation learning.

Keywords