Spatio‐temporal dynamic navigation for electric vehicle charging using deep reinforcement learning

Ali Can Erüst; Fatma Yıldız Taşcıkaraoğlu

doi:10.1049/itr2.12588

IET Intelligent Transport Systems (Dec 2024)

Affiliations

Ali Can Erüst: Department of Electrical and Electronics Engineering, Mugla Sitki Kocman University, Mugla, Türkiye
Fatma Yıldız Taşcıkaraoğlu: Department of Electrical and Electronics Engineering, Mugla Sitki Kocman University, Mugla, Türkiye

DOI: https://doi.org/10.1049/itr2.12588
Journal volume & issue: Vol. 18, no. 12
pp. 2520 – 2531

Abstract

This paper addresses the real‐time spatio‐temporal electric vehicle (EV) charging navigation problem in a dynamic environment using a shortest path‐based reinforcement learning approach. In a data‐sharing system comprising a transportation network, an EV and EV charging stations (EVCSs), the aim is to determine the most convenient EVCS and the optimal path to it so as to reduce travel, charging and waiting costs. To estimate waiting times at the EVCSs, a Gaussian process regression algorithm is integrated, using a real‐time dataset comprising the state of charge and the arrival and departure times of EVs. The optimization problem is modelled as a Markov decision process with unknown transition probabilities to handle the uncertainties arising from time‐varying variables. Phasic policy gradient (PPG), a recently proposed on‐policy actor–critic method, extends the proximal policy optimization (PPO) algorithm with an auxiliary optimization phase that improves training by distilling features from the critic to the actor network. PPG is used to make EVCS decisions on the network, where the EV travels along the optimal path from the origin node to the EVCS, taking into account dynamic traffic conditions, the unit value of the EV owner and the time‐of‐use charging price. Three case studies are carried out on the 24‐node Sioux Falls benchmark network. The results show that PPG achieves on average a 9% better reward than PPO, and that the total time decreases by 7–10% when the EV owner cost is considered.
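The cost trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' PPG agent. It is a greedy baseline that runs Dijkstra's shortest-path algorithm on a toy graph and picks the station with the lowest combined travel, waiting and charging cost. The graph, the station data, the fixed waiting times (which the paper instead estimates with Gaussian process regression) and the names `value_of_time` and `energy_kwh` are all assumptions made for illustration.

```python
import heapq


def shortest_travel_times(graph, origin):
    """Dijkstra: minimum travel time (minutes) from origin to each reachable node."""
    dist = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, minutes in graph.get(node, {}).items():
            nd = d + minutes
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return dist


def cheapest_station(graph, origin, stations, value_of_time, energy_kwh):
    """Return (best station, cost per station) for a hypothetical cost model:
    value_of_time * (travel + wait minutes) + price_per_kwh * energy_kwh."""
    dist = shortest_travel_times(graph, origin)
    costs = {}
    for node, info in stations.items():
        if node not in dist:
            continue  # station not reachable from the origin
        time_cost = value_of_time * (dist[node] + info["wait_min"])
        charge_cost = info["price_per_kwh"] * energy_kwh
        costs[node] = time_cost + charge_cost
    return min(costs, key=costs.get), costs


# Toy network (assumed): edge weights are travel times in minutes.
graph = {
    "A": {"B": 4.0, "C": 10.0},
    "B": {"C": 3.0, "D": 8.0},
    "C": {"D": 2.0},
    "D": {},
}
# Assumed station data: waiting time and time-of-use price at arrival.
stations = {
    "C": {"wait_min": 15.0, "price_per_kwh": 0.30},
    "D": {"wait_min": 2.0, "price_per_kwh": 0.40},
}

best, costs = cheapest_station(graph, "A", stations, value_of_time=0.5, energy_kwh=20.0)
print(best, costs)
```

In this toy case the farther and pricier station wins because its queue is much shorter. A learned policy matters in the setting the abstract describes because travel times, waiting times and prices change while the vehicle is en route, which a one-shot greedy choice like this ignores.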

Published in IET Intelligent Transport Systems

ISSN: 1751-956X (Print); 1751-9578 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Engineering (General). Civil engineering (General): Transportation engineering; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519578
