Time Difference Penalized Traffic Signal Timing by LSTM Q-Network to Balance Safety and Capacity at Intersections

Lyuchao Liao; Jierui Liu; Xinke Wu; Fumin Zou; Jengshyang Pan; Qi Sun; Shengbo Eben Li; Maolin Zhang

doi:10.1109/ACCESS.2020.2989151

IEEE Access (Jan 2020)

Affiliations

Lyuchao Liao: ORCiD; Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China
Jierui Liu: Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China
Xinke Wu: Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China
Fumin Zou: Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China
Jengshyang Pan: Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China
Qi Sun: School of Vehicle and Mobility, Tsinghua University, Beijing, China
Shengbo Eben Li: School of Vehicle and Mobility, Tsinghua University, Beijing, China
Maolin Zhang: Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, Fuzhou, China

Journal volume & issue: Vol. 8, pp. 80086–80096

Abstract

The conflict between limited road resources and rapidly growing car ownership makes traffic signal timing a pivotal challenge. Many recent studies address adaptive signal timing, but most still focus on intersection throughput and leave safety and travel experience unconsidered. This paper proposes a time difference penalized traffic signal timing method based on reinforcement learning to balance safety and throughput capacity in traffic control systems. First, a microscopic state representation is proposed to integrate the dynamics of both traffic lights and road vehicles, including the driver behaviors of lane changing and car following, the previous traffic light phase, and its duration. Second, an action space of 8 signal phases and a behavior-aware reward function are designed to resist red-light overflow. Finally, a partial long short-term memory (LSTM) network is trained to balance traffic efficiency and traveling experience. During network training, a parallel sampling method is adopted to obtain experience from multiple environments and accelerate training convergence in practical applications. Experimental results show that the proposed method improves intersection efficiency by up to 14.28% compared with fixed signal timing and 5.26% compared with DQN, while eliminating red-light overflow time.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
