Multi-modal deep learning framework for damage detection in social media posts

Jiale Zhang; Manyu Liao; Yanping Wang; Yifan Huang; Fuyu Chen; Chiba Makiko

doi:10.7717/peerj-cs.2262

PeerJ Computer Science (Aug 2024)

Affiliations

Jiale Zhang: School of Journalism and Communication, Nanchang University, Nanchang, China
Manyu Liao: School of Journalism and Communication, Nanchang University, Nanchang, China
Yanping Wang: School of Journalism and Communication, Nanchang University, Nanchang, China
Yifan Huang: School of International Relations and Diplomacy, Beijing Foreign Studies University, Beijing, China
Fuyu Chen: Journalism and Information Communication School, Huazhong University of Science and Technology, Wuhan, China
Chiba Makiko: School of Foreign Languages, Zhejiang University of Technology, Zhejiang, China

DOI: https://doi.org/10.7717/peerj-cs.2262
Journal volume & issue: Vol. 10, p. e2262

Abstract

In crisis management, quickly identifying and helping affected individuals is key, especially when there is limited information about the survivors’ conditions. Traditional emergency systems often face issues with reachability and handling large volumes of requests. Social media has become crucial in disaster response, providing important information and aiding in rescues when standard communication systems fail. Due to the large amount of data generated on social media during emergencies, there is a need for automated systems to process this information effectively and help improve emergency responses, potentially saving lives. Therefore, accurately understanding visual scenes and their meanings is important for identifying damage and obtaining useful information. Our research introduces a framework for detecting damage in social media posts, combining the Bidirectional Encoder Representations from Transformers (BERT) architecture with advanced convolutional processing. This framework includes a BERT-based network for analyzing text and multiple convolutional neural network blocks for processing images. The results show that this combination is very effective, outperforming existing methods in accuracy, recall, and F1 score. In the future, this method could be enhanced by including more types of information, such as human voices or background sounds, to improve its prediction efficiency.

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
