Multi-Path Routing Algorithm Based on Deep Reinforcement Learning for SDN

Yi Zhang; Lanxin Qiu; Yangzhou Xu; Xinjia Wang; Shengjie Wang; Agyemang Paul; Zhefu Wu

doi:10.3390/app132212520

Applied Sciences (Nov 2023)

Affiliations

Yi Zhang: Information Communication Branch of State Grid Zhejiang Electric Power Co., Hangzhou 310007, China
Lanxin Qiu: Information Communication Branch of State Grid Zhejiang Electric Power Co., Hangzhou 310007, China
Yangzhou Xu: Information Communication Branch of State Grid Zhejiang Electric Power Co., Hangzhou 310007, China
Xinjia Wang: Information Communication Branch of State Grid Zhejiang Electric Power Co., Hangzhou 310007, China
Shengjie Wang: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Agyemang Paul: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Zhefu Wu: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

DOI: https://doi.org/10.3390/app132212520
Journal volume & issue: Vol. 13, no. 22, p. 12520

Abstract

Software-Defined Networking (SDN) enhances network control but is vulnerable to Distributed Denial of Service (DDoS) attacks because of its centralized control and the flow-table constraints of network devices. To address this vulnerability, we introduce a multi-path routing algorithm for SDN called Trust-Based Proximal Policy Optimization (TBPPO). TBPPO uses a Kullback–Leibler (KL) divergence trust value and a node diversity mechanism as its security assessment criteria, aiming to mitigate network fluctuations, low robustness, and congestion, with particular emphasis on countering DDoS attacks. To avoid routing loops, and unlike the conventional ‘Next Hop’ routing decision method, we use an enhanced Depth-First Search (DFS) approach that pre-computes path sets, from which the best path is selected. To optimize routing efficiency, we introduce an improved Proximal Policy Optimization (PPO) algorithm based on deep reinforcement learning, which optimizes multi-path routing with respect to security, network delay, and variation in delay across paths. In the Germany-50 evaluation, TBPPO outperforms traditional methods, reducing average delay by 20%, cutting delay variation by 50%, and leading in trust value by 0.5. TBPPO thus provides a practical and effective way to improve SDN security and routing efficiency.
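
The idea of pre-computing loop-free path sets with DFS and then choosing among them can be illustrated with a minimal Python sketch. It is not the paper's enhanced DFS or its PPO-based selection: the toy topology, the hop limit `max_hops`, the function names, and the use of summed link delay as the ranking criterion are all assumptions made for illustration, and the trust value and node diversity criteria are omitted.

```python
from typing import Dict, List

# Hypothetical toy topology: node -> {neighbor: link delay in ms}.
Graph = Dict[str, Dict[str, float]]

TOPOLOGY: Graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 1.0, "D": 5.0},
    "C": {"A": 4.0, "B": 1.0, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}


def enumerate_paths(graph: Graph, src: str, dst: str, max_hops: int) -> List[List[str]]:
    """Collect every simple (loop-free) path from src to dst with at most max_hops links."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == dst:
            paths.append(list(path))
            return
        if len(path) - 1 >= max_hops:  # hop budget used up, stop extending
            return
        for neighbor in graph.get(node, {}):
            if neighbor not in path:  # never revisit a node, so no routing loop
                path.append(neighbor)
                dfs(neighbor, path)
                path.pop()

    dfs(src, [src])
    return paths


def path_delay(graph: Graph, path: List[str]) -> float:
    """Sum the link delays along a path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def k_best_paths(graph: Graph, src: str, dst: str, k: int = 2, max_hops: int = 4) -> List[List[str]]:
    """Rank the pre-computed path set by total delay (assumed criterion) and keep k paths."""
    candidates = enumerate_paths(graph, src, dst, max_hops)
    return sorted(candidates, key=lambda p: path_delay(graph, p))[:k]
```

Calling `k_best_paths(TOPOLOGY, "A", "D")` returns the two lowest-delay loop-free paths in this toy graph. Because every candidate is a complete simple path fixed in advance, the selection step cannot introduce a loop, which is the property the abstract contrasts with per-hop decisions.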

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
