Sensors (Feb 2024)
A Temporal Deep Q Learning for Optimal Load Balancing in Software-Defined Networks
Abstract
With the rapid advancement of the Internet of Things (IoT), there is a global surge in network traffic. Software-Defined Networks (SDNs) provide a holistic network perspective, facilitating software-based traffic analysis, and are better suited than traditional networks to handling dynamic loads. The standard SDN control plane has been designed for either a single controller or multiple distributed controllers; however, a logically centralized single controller faces severe bottleneck issues. Most solutions proposed in the literature are based on the static deployment of multiple controllers without considering flow fluctuations and traffic bursts, which leads to a lack of real-time load balancing among controllers and, ultimately, increased network latency. Moreover, some methods addressing dynamic controller mapping in multi-controller SDNs account for load fluctuation and latency but suffer from the controller placement problem. We previously proposed a priority scheduling and congestion control algorithm (eSDN) and dynamic mapping of controllers for dynamic SDN (dSDN) to address these issues. However, the future growth of IoT is unpredictable and potentially exponential; to accommodate this trend, an intelligent solution is needed to handle the complexity of growing heterogeneous devices and minimize network latency. This paper therefore continues our previous research and proposes temporal deep Q learning in the dSDN controller. The Temporal Deep Q-learning Network (tDQN) serves as a self-learning, reinforcement-learning-based model. The agent in the tDQN learns to improve switch-controller mapping decisions through a reward–punish scheme, with the goal of minimizing network latency over the iterative learning process. Our approach—tDQN—effectively addresses dynamic flow mapping and latency optimization without increasing the number of optimally placed controllers.
A multi-objective optimization problem for flow fluctuation is formulated to dynamically divert traffic to the best-suited controller. Extensive simulation results across varied network scenarios and traffic loads show that the tDQN outperforms traditional networks, eSDN, and dSDN in terms of throughput, delay, jitter, packet delivery ratio, and packet loss.
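To make the reward–punish idea behind the switch-controller mapping concrete, the following is a minimal, hypothetical sketch: a tabular Q-learning stand-in for the paper's deep Q network, with a toy reward that uses the negative standard deviation of controller loads as a proxy for the latency objective. All names, sizes, and the reward design here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

# Toy setup (illustrative only): a handful of switches and controllers.
# The paper's tDQN uses a deep network and a richer state/reward design;
# this tabular sketch only mirrors the reward-punish mapping loop.
rng = np.random.default_rng(0)
NUM_SWITCHES, NUM_CONTROLLERS = 6, 3
EPISODES, ALPHA, GAMMA, EPS = 500, 0.1, 0.9, 0.1

# Q[s, a]: learned value of mapping switch s to controller a
Q = np.zeros((NUM_SWITCHES, NUM_CONTROLLERS))
switch_load = rng.uniform(1.0, 5.0, NUM_SWITCHES)  # per-switch flow demand

def reward(mapping):
    # Balanced controller loads -> smaller std-dev -> higher reward
    # (a crude stand-in for the latency-minimization objective).
    loads = np.bincount(mapping, weights=switch_load,
                        minlength=NUM_CONTROLLERS)
    return -loads.std()

mapping = rng.integers(0, NUM_CONTROLLERS, NUM_SWITCHES)
for _ in range(EPISODES):
    s = int(rng.integers(NUM_SWITCHES))   # pick a switch to (re)map
    if rng.random() < EPS:                # epsilon-greedy exploration
        a = int(rng.integers(NUM_CONTROLLERS))
    else:
        a = int(Q[s].argmax())
    mapping[s] = a
    r = reward(mapping)
    # One-step tabular Q update (stand-in for the DQN's gradient step)
    Q[s, a] += ALPHA * (r + GAMMA * Q[s].max() - Q[s, a])

# Greedy mapping after training: each switch's best-valued controller
best_mapping = Q.argmax(axis=1)
```

The learned greedy mapping spreads switch load across controllers, whereas a static single-controller mapping concentrates it — the imbalance that motivates dynamic remapping in the first place.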
Keywords