Multi-Agent Reinforcement-Learning-Based Time-Slotted Channel Hopping Medium Access Control Scheduling Scheme

Huiung Park; Haeyong Kim; Seon-Tae Kim; Pyeongsoo Mah

doi:10.1109/ACCESS.2020.3010575

IEEE Access (Jan 2020)

Affiliations

Huiung Park: Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Haeyong Kim: Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Seon-Tae Kim: Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
Pyeongsoo Mah: Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea

DOI: https://doi.org/10.1109/ACCESS.2020.3010575
Journal volume & issue: Vol. 8
pp. 139727 – 139736

Abstract

Time-slotted channel hopping (TSCH) is a medium access control technology that realizes collision-free wireless network communication by coordinating the medium access time and channel of network devices. Although each existing TSCH scheduler suits particular application scenarios, these schedulers lack versatility. Collision-free scheduling inevitably lowers the throughput, whereas contention-based scheduling achieves high throughput but may induce frequent collisions in densely deployed networks. Therefore, a TSCH scheduler that can be used universally, regardless of the topology and data collection characteristics of the application scenario, is required to overcome these shortcomings. To this end, a multi-agent reinforcement learning (RL)-based TSCH scheduling scheme that allows contention but minimizes collisions is proposed in this study. RL is a machine-learning method that gradually improves actions to solve problems. One specific RL method, Q-Learning (QL), is used in the scheme so that the TSCH scheduler acts as a QL agent that learns the best transmission slot. To improve the QL performance, reward functions tailored to the TSCH scheduler were developed. Because the QL agent runs on multiple nodes concurrently, a change in the TSCH schedule of one node also affects the performance of the TSCH schedules of other nodes. The use of action peeking is proposed to overcome this non-stationarity problem in a multi-agent environment. The experimental results indicate that the proposed TSCH scheduler consistently performs well in various types of applications, compared to other schedulers.
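
To make the idea of a scheduler acting as a QL agent concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-state Q-table per node, epsilon-greedy exploration, and a toy reward of +1 for a transmission that is alone in its slot and -1 for a collision. The function names (choose_slot, q_update, simulate), the parameter values, and the reward values are all hypothetical. The paper's tailored reward functions and its action-peeking mechanism are not modeled here.

```python
import random


def choose_slot(q_row, epsilon, rng):
    """Epsilon-greedy choice of a transmission slot within one slotframe."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: any slot
    best = max(q_row)
    # exploit: pick among the highest-valued slots, breaking ties randomly
    return rng.choice([s for s, q in enumerate(q_row) if q == best])


def q_update(q_row, slot, reward, alpha=0.1, gamma=0.9):
    """In-place single-state Q-learning update for the chosen slot."""
    target = reward + gamma * max(q_row)
    q_row[slot] += alpha * (target - q_row[slot])


def simulate(num_nodes=4, num_slots=8, rounds=2000, epsilon=0.1, seed=0):
    """Run independent QL agents that all contend for the same slotframe."""
    rng = random.Random(seed)
    q = [[0.0] * num_slots for _ in range(num_nodes)]
    for _ in range(rounds):
        picks = [choose_slot(q[n], epsilon, rng) for n in range(num_nodes)]
        for n, slot in enumerate(picks):
            # Assumed toy reward: +1 if the node was alone in its slot,
            # -1 if another node transmitted in the same slot (collision).
            reward = 1.0 if picks.count(slot) == 1 else -1.0
            q_update(q[n], slot, reward)
    # Report the greedy (highest-valued) slot of each node after learning.
    return [max(range(num_slots), key=q[n].__getitem__) for n in range(num_nodes)]


if __name__ == "__main__":
    print(simulate())
```

Calling simulate() returns one greedy slot index per node. Because every agent's reward depends on what the other agents pick, each agent is learning against a moving target, which is the non-stationarity the abstract refers to. This toy version does nothing to counter it and does not guarantee that the nodes end up in distinct slots.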

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
