Reinforcement Learning Based Stochastic Shortest Path Finding in Wireless Sensor Networks

Wenwen Xia; Chong Di; Haonan Guo; Shenghong Li

doi:10.1109/ACCESS.2019.2950055

IEEE Access (Jan 2019)

Affiliations

Wenwen Xia: School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China
Chong Di: School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China
Haonan Guo: School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China
Shenghong Li: School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China

DOI: https://doi.org/10.1109/ACCESS.2019.2950055
Journal volume & issue: Vol. 7
pp. 157807 – 157817

Abstract

Many factors, such as physical distance and network load, influence the connection states between nodes of wireless sensor networks, making the network's edge lengths dynamic in many scenarios. This dynamic property means the network essentially forms a graph with stochastic edge lengths. In this paper, we study the stochastic shortest path problem on a directed graph with stochastic edge lengths, using reinforcement learning algorithms. We regard each edge length as a random variable following an unknown probability distribution and aim to find the stochastic shortest path on this stochastic graph. We evaluate the performance of path-finding algorithms using regret, which represents the cumulative reward difference between the practical path-finding algorithm and the optimal strategy that chooses the global stochastic shortest path every time. We model the path-finding procedure as a Markov decision process and propose two online path-finding algorithms, the QSSP algorithm and the SARSASSP algorithm, both combined with a specifically devised average reward mechanism. We justify the convergence property and correctness of the proposed algorithms theoretically. Experiments conducted on two benchmark graphs illustrate the superior performance of the proposed QSSP algorithm, which outperforms the SARSASSP algorithm and other competitive algorithms in terms of the regret metric.
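To make the setting in the abstract concrete, the following is a minimal sketch of plain tabular Q-learning on a small directed graph with random edge lengths, together with a regret count. It is not the paper's QSSP or SARSASSP algorithm and it omits the average reward mechanism, whose details the abstract does not give. The toy graph, the uniform edge-length distributions, the function names, and the parameters (alpha, epsilon, episodes) are all assumptions made for illustration. Here regret is taken as the accumulated gap between the expected length of the chosen path and that of the best path, which is one common reading of the definition above.

```python
import random

# Hypothetical toy graph (an assumption, not from the paper):
# node -> {next_node: (mean_length, spread)}.
# Each traversal draws a length uniformly from [mean - spread, mean + spread].
GRAPH = {
    "s": {"a": (1.0, 0.5), "b": (2.0, 0.5)},
    "a": {"t": (3.0, 1.0)},
    "b": {"t": (1.0, 0.5)},
    "t": {},
}
SOURCE, TARGET = "s", "t"


def sample_length(rng, u, v):
    """Draw one random length for edge (u, v); the learner never sees the mean."""
    mean, spread = GRAPH[u][v]
    return rng.uniform(mean - spread, mean + spread)


def best_expected_length(node):
    """Expected length of the best path from node to TARGET (graph is acyclic)."""
    if node == TARGET:
        return 0.0
    return min(GRAPH[node][v][0] + best_expected_length(v) for v in GRAPH[node])


def run_q_learning(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Generic epsilon-greedy Q-learning where Q estimates remaining path length."""
    rng = random.Random(seed)
    q = {u: {v: 0.0 for v in nbrs} for u, nbrs in GRAPH.items()}
    optimal = best_expected_length(SOURCE)
    regret = 0.0
    for _ in range(episodes):
        node, expected_len = SOURCE, 0.0
        while node != TARGET:
            if rng.random() < epsilon:
                nxt = rng.choice(list(q[node]))       # explore
            else:
                nxt = min(q[node], key=q[node].get)   # exploit: shortest estimate
            cost = sample_length(rng, node, nxt)
            # Remaining-length estimate at the next node (0 at the target).
            future = min(q[nxt].values()) if q[nxt] else 0.0
            q[node][nxt] += alpha * (cost + future - q[node][nxt])
            expected_len += GRAPH[node][nxt][0]
            node = nxt
        # Regret grows whenever the chosen path is longer in expectation.
        regret += expected_len - optimal
    return q, regret


if __name__ == "__main__":
    q_table, total_regret = run_q_learning()
    print("greedy first hop:", min(q_table["s"], key=q_table["s"].get))
    print("cumulative regret:", total_regret)
```

In this toy graph the path s, b, t has the smaller expected length, so every episode that routes through a adds to the regret. A lower cumulative regret therefore means the learner settled on the better path sooner.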

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
