A deep reinforcement learning with dynamic spatio-temporal graph model for solving urban logistics delivery planning problems

Yuanyuan Li; Qingfeng Guan; Junfeng Gu; Xintong Jiang

doi:10.1080/17538947.2024.2376273

International Journal of Digital Earth (Dec 2024)

Affiliations

Yuanyuan Li: School of Geography and Information Engineering, China University of Geosciences, Wuhan, People’s Republic of China
Qingfeng Guan: School of Geography and Information Engineering, China University of Geosciences, Wuhan, People’s Republic of China
Junfeng Gu: School of Geography and Information Engineering, China University of Geosciences, Wuhan, People’s Republic of China
Xintong Jiang: School of Future Technology (SFT), China University of Geosciences, Wuhan, People’s Republic of China

DOI: https://doi.org/10.1080/17538947.2024.2376273
Journal volume & issue: Vol. 17, no. 1

Abstract

Urban logistics delivery planning problems are a crucial component of urban spatial decision analysis. Most studies focus on traditional urban logistics delivery planning problems and ignore real-time traffic information. As urbanization advances, real-time traffic networks play a critical role. However, previous studies have used heuristic methods to solve urban logistics delivery planning problems with real-time traffic information, and few have applied deep reinforcement learning methods to them. Moreover, deep reinforcement learning methods for traditional logistics delivery planning problems overlook the impact of dynamic spatio-temporal features on route planning. In this study, we propose a new deep reinforcement learning method called DRLDSTG. The method introduces a dynamic spatio-temporal graph model into deep reinforcement learning to capture these dynamic features from urban logistics delivery planning tasks. An actor-critic method with maximum entropy is employed to train the model and determine the optimal policy function. The experimental results indicate that the proposed method can achieve superior solutions with higher computational efficiency than commercial software and heuristic methods. Compared to other deep reinforcement learning methods, our method learns dynamic spatio-temporal features from environments more effectively, demonstrating promising applications in cities.

Published in International Journal of Digital Earth

ISSN: 1753-8947 (Print); 1753-8955 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Mathematical geography. Cartography
Website: https://www.tandfonline.com/journals/tjde
