Predicting Intentions of Pedestrians from 2D Skeletal Pose Sequences with a Representation-Focused Multi-Branch Deep Learning Network

Joseph Gesnouin; Steve Pechberti; Guillaume Bresson; Bogdan Stanciulescu; Fabien Moutarde

doi:10.3390/a13120331

Algorithms (Dec 2020)

Affiliations

Joseph Gesnouin: Institut VEDECOM—Versailles, 78000 Versailles, France
Steve Pechberti: Institut VEDECOM—Versailles, 78000 Versailles, France
Guillaume Bresson: Institut VEDECOM—Versailles, 78000 Versailles, France
Bogdan Stanciulescu: Centre de Robotique, MINES ParisTech, Université PSL, 75006 Paris, France
Fabien Moutarde: Centre de Robotique, MINES ParisTech, Université PSL, 75006 Paris, France

DOI: https://doi.org/10.3390/a13120331
Journal volume & issue: Vol. 13, no. 12, p. 331

Abstract

Understanding the behaviors and intentions of humans is still one of the main challenges for vehicle autonomy. More specifically, inferring the intentions and actions of vulnerable actors, namely pedestrians, in complex situations such as urban traffic scenes remains a difficult task and a blocking point towards more automated vehicles. Answering the question “Is the pedestrian going to cross?” is a good starting point for advancing towards the fifth level of autonomous driving. In this paper, we address the problem of real-time discrete intention prediction of pedestrians in urban traffic environments by linking the dynamics of a pedestrian’s skeleton to an intention. To this end, we propose SPI-Net (Skeleton-based Pedestrian Intention network): a representation-focused multi-branch network combining features from 2D pedestrian body poses for the prediction of pedestrians’ discrete intentions. Experimental results show that SPI-Net achieved 94.4% accuracy in pedestrian crossing prediction on the JAAD data set while being efficient enough for real-time scenarios: SPI-Net can reach around one inference every 0.25 ms on one GPU (an RTX 2080 Ti) or every 0.67 ms on one CPU (an Intel Core i7-8700K).
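
To put the latency figures from the abstract in perspective, the short sketch below converts a per-inference time into a throughput. Only the 0.25 ms and 0.67 ms values come from the abstract; the function names and the 30 frames-per-second camera rate are assumptions introduced purely for illustration, and the abstract does not say whether the timings include the upstream pose estimation.

```python
def inferences_per_second(latency_ms: float) -> float:
    """Convert a per-inference latency (milliseconds) to inferences per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def inferences_per_frame(latency_ms: float, fps: float) -> float:
    """Number of inferences that fit in one camera frame interval at the given fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame_budget_ms = 1000.0 / fps
    return frame_budget_ms / latency_ms


# Latencies reported in the abstract.
GPU_LATENCY_MS = 0.25  # RTX 2080 Ti
CPU_LATENCY_MS = 0.67  # Intel Core i7-8700K

# Assumption for illustration only: a 30 fps camera (not stated in the source).
ASSUMED_CAMERA_FPS = 30.0

if __name__ == "__main__":
    for name, latency in (("GPU", GPU_LATENCY_MS), ("CPU", CPU_LATENCY_MS)):
        rate = inferences_per_second(latency)
        per_frame = inferences_per_frame(latency, ASSUMED_CAMERA_FPS)
        print(f"{name}: {rate:.0f} inferences/s, "
              f"{per_frame:.1f} per frame at {ASSUMED_CAMERA_FPS:.0f} fps")
```

Under these assumptions, both reported latencies are far below a single frame interval, which is consistent with the abstract's claim of suitability for real-time scenarios.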

Published in Algorithms

ISSN: 1999-4893 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/algorithms
