A Comprehensive Analysis of a Social Intelligence Dataset and Response Tendencies Between Large Language Models (LLMs) and Humans

Erika Mori; Yue Qiu; Hirokatsu Kataoka; Yoshimitsu Aoki

doi:10.3390/s25020477

Sensors (Jan 2025)

Affiliations

Erika Mori: National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8560, Japan
Yue Qiu: National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8560, Japan
Hirokatsu Kataoka: National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba 305-8560, Japan
Yoshimitsu Aoki: Department of Electronics and Electrical Engineering, Faculty of Science and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan

DOI: https://doi.org/10.3390/s25020477
Journal volume & issue: Vol. 25, no. 2, p. 477

Abstract

In recent years, advances in interaction and collaboration between humans and Artificial Intelligence (AI) have garnered significant attention. Social intelligence plays a crucial role in enabling natural interaction and seamless communication between humans and AI. To assess AI’s ability to understand human interactions, and to identify the components such understanding requires, datasets like Social-IQ have been developed. However, these datasets often rely on a simplistic question-and-answer format and lack justifications for the provided answers. Furthermore, existing methods typically produce direct answers by selecting from predefined choices without generating intermediate outputs, which hampers interpretability and reliability. To address these limitations, we conducted a comprehensive evaluation of AI methods on a video-based Question Answering (QA) benchmark focused on human interactions, leveraging additional annotations related to human responses. Our analysis highlights significant differences between human and AI response patterns and underscores critical shortcomings in current benchmarks. We anticipate that these findings will guide the creation of more advanced datasets and represent an important step toward achieving natural communication between humans and AI.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
