A Novel Medium Access Policy Based on Reinforcement Learning in Energy-Harvesting Underwater Sensor Networks

Çiğdem Eriş; Ömer Melih Gül; Pınar Sarısaray Bölük

doi:10.3390/s24175791

Sensors (Sep 2024)

Affiliations

Çiğdem Eriş: Department of Computer Engineering, Bahcesehir University, Istanbul 34353, Turkey
Ömer Melih Gül: Department of Computer Engineering, Bahcesehir University, Istanbul 34353, Turkey
Pınar Sarısaray Bölük: Department of Artificial Intelligence and Data Engineering, Istanbul University, Istanbul 34134, Turkey

DOI: https://doi.org/10.3390/s24175791
Journal volume & issue: Vol. 24, no. 17, p. 5791

Abstract

Underwater acoustic sensor networks (UASNs) are fundamental assets for the discovery and utilization of sub-sea environments, and they have attracted both academia and industry as a means of executing long-term underwater missions. Because underwater wireless sensor nodes depend heavily on their batteries, our objective is to maximize the amount of energy harvested underwater by adopting a TDMA time slot scheduling approach, thereby prolonging the operational lifetime of the sensors. In this study, we consider the spatial uncertainty of underwater ambient resources to improve the utilization of available energy, and we examine a stochastic model for piezoelectric energy harvesting. Considering realistic channel and environment conditions, we propose a novel multi-agent reinforcement learning algorithm. Nodes observe and learn from their choice of transmission slots based on the energy available in the underwater medium, and they autonomously adapt their communication slots to their energy harvesting conditions instead of relying on the cluster head. In the numerical results, we present the impact of piezoelectric energy harvesting and harvesting awareness on three lifetime metrics. We observe that energy harvesting contributes a 4% improvement in first node dead (FND), a 14% improvement in half node dead (HND), and a 22% improvement in last node dead (LND). Additionally, the harvesting-aware TDMA-RL method further increases HND by 17% and LND by 38%. Our results show that the proposed method improves the utilization of the in-cluster communication time interval and outperforms traditional time slot allocation methods in terms of throughput and energy harvesting efficiency.
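The three lifetime metrics named in the abstract can be illustrated with a minimal sketch. It assumes each node's lifetime is recorded as the simulation round in which its battery was depleted, and that HND is the round by which at least half of the nodes are dead. The function names `lifetime_metrics` and `improvement_pct` and the sample death rounds are hypothetical. The sample values are made up so that the relative gains match the percentages quoted above; they are not data from the paper.

```python
import math


def lifetime_metrics(death_rounds):
    """Return (FND, HND, LND) from the round in which each node died.

    Assumption: HND is the round by which at least half of the nodes
    have died. The paper may define it slightly differently.
    """
    if not death_rounds:
        raise ValueError("need at least one node")
    ordered = sorted(death_rounds)
    n = len(ordered)
    fnd = ordered[0]                     # first node dead
    hnd = ordered[math.ceil(n / 2) - 1]  # half of the nodes dead
    lnd = ordered[-1]                    # last node dead
    return fnd, hnd, lnd


def improvement_pct(baseline, value):
    """Relative improvement of `value` over `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


# Made-up death rounds for a four-node cluster (illustrative only).
no_harvesting = [100, 200, 300, 400]
with_harvesting = [104, 228, 350, 488]

base = lifetime_metrics(no_harvesting)
harv = lifetime_metrics(with_harvesting)
for name, b, h in zip(("FND", "HND", "LND"), base, harv):
    print(f"{name}: {b} -> {h} rounds ({improvement_pct(b, h):.0f}% improvement)")
```

Reporting all three values, rather than a single lifetime figure, shows whether a scheduling policy mainly protects the weakest node (FND) or extends the operation of the cluster as a whole (HND, LND).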

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
