Putting ChatGPT vision (GPT-4V) to the test: risk perception in traffic images

Tom Driessen; Dimitra Dodou; Pavlo Bazilinskyy; Joost de Winter

doi:10.1098/rsos.231676

Royal Society Open Science (May 2024)

Affiliations

Tom Driessen: Delft University of Technology, Delft, Zuid-Holland, The Netherlands
Dimitra Dodou: Delft University of Technology, Delft, Zuid-Holland, The Netherlands
Pavlo Bazilinskyy: Eindhoven University of Technology, Eindhoven, Noord-Brabant, The Netherlands
Joost de Winter: Delft University of Technology, Delft, Zuid-Holland, The Netherlands

DOI: https://doi.org/10.1098/rsos.231676
Journal volume & issue: Vol. 11, no. 5

Abstract

Vision-language models are of interest in various domains, including automated driving, where computer vision techniques can accurately detect road users, but where the vehicle sometimes fails to understand context. This study examined the effectiveness of GPT-4V in predicting the level of ‘risk’ in traffic images as assessed by humans. We used 210 static images taken from a moving vehicle, each previously rated by approximately 650 people. Based on psychometric construct theory and using insights from the self-consistency prompting method, we formulated three hypotheses: (i) repeating the prompt under effectively identical conditions increases validity, (ii) varying the prompt text and extracting a total score increases validity compared to using a single prompt, and (iii) in a multiple regression analysis, the incorporation of object detection features, alongside the GPT-4V-based risk rating, significantly contributes to improving the model's validity. Validity was quantified by the correlation coefficient with human risk scores, across the 210 images. The results confirmed the three hypotheses. The eventual validity coefficient was r = 0.83, indicating that population-level human risk can be predicted using AI with a high degree of accuracy. The findings suggest that GPT-4V must be prompted in a way equivalent to how humans fill out a multi-item questionnaire.
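
The reasoning behind hypothesis (i) can be illustrated with a small simulation. The sketch below is not the authors' code and does not call GPT-4V: it assumes, purely for illustration, that each model rating equals a latent human risk score plus independent random noise, and it uses synthetic data throughout. The names `pearson_r` and `simulate`, the noise level, and the number of repeats are hypothetical; only the count of 210 images comes from the abstract. Under that assumed noise model, averaging repeated ratings per image should correlate more strongly with the human scores than a single rating does.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_images=210, n_repeats=20, noise_sd=2.0, seed=0):
    """Compare the validity of one rating vs. the mean of repeated ratings.

    Assumption (not from the paper): rating = human score + Gaussian noise.
    All values are synthetic; nothing here reproduces the study's data.
    """
    rng = random.Random(seed)
    # Synthetic stand-in for the mean human risk score of each image.
    human = [rng.gauss(0, 1) for _ in range(n_images)]
    # Simulated repeated model ratings per image under the assumed noise model.
    ratings = [
        [h + rng.gauss(0, noise_sd) for _ in range(n_repeats)] for h in human
    ]
    single = [r[0] for r in ratings]       # one prompt per image
    averaged = [mean(r) for r in ratings]  # mean over repeated prompts
    return pearson_r(human, single), pearson_r(human, averaged)


r_single, r_averaged = simulate()
print(f"validity with a single rating: r = {r_single:.2f}")
print(f"validity with averaged ratings: r = {r_averaged:.2f}")
```

The same aggregation logic applies to a multi-item questionnaire: random error in individual responses partly cancels in the total score, which is the analogy the abstract draws.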

Published in Royal Society Open Science

ISSN: 2054-5703 (Online)
Publisher: The Royal Society
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://royalsocietypublishing.org/journal/rsos
