Results in Engineering (Dec 2024)
A comprehensive analysis of model poisoning attacks in federated learning for autonomous vehicles: A benchmark study
Abstract
Due to the increase in data regulations amid rising privacy concerns, the machine learning (ML) community has proposed a distributed training paradigm called federated learning (FL). FL enables groups of mutually untrusted clients to collaboratively train a model without sharing their private data. The rise of connected vehicles has paved the way for a new era of data-driven traffic management, but it also exposes vulnerabilities to cyber attacks that threaten safety and security. One such vulnerability arises from malicious clients who upload "poisoned" updates during training, degrading the FL model's performance and potentially leading to catastrophic outcomes. This paper presents a thorough benchmarking study designed to critically analyse and evaluate the effectiveness of Byzantine-robust aggregation as a countermeasure against state-of-the-art untargeted model poisoning attacks, using an Autonomous Vehicles (AV) benchmark dataset. The research objectives are: (1) to assess the vulnerability of Byzantine-robust aggregation rules to model poisoning attacks; (2) to evaluate the impact of model poisoning attacks under different practical scenarios, varying the perturbation vectors and the data distributions across diverse datasets; and (3) to quantify the degradation in performance and efficacy during attacks involving malicious clients. Additionally, this study tests the commonly held belief that an Independent and Identically Distributed (IID) data distribution is universally more secure than a non-IID one across different FL scenarios. To address these objectives, we conduct extensive experiments using: (1) three benchmark datasets of different sizes, sourced from two domains, to simulate the heterogeneous statistics of real-world scenarios (IID, non-IID, and imbalanced non-IID); and (2) two federated settings (cross-device and cross-silo) with realistic threat models, adversarial capabilities, and FL parameters.
One of the main experimental results is that client-selection strategies in cross-device settings can offer a simple yet robust defense. Finally, conclusions and findings are set out (some of which contradict claims made in previous studies), together with recommendations for potential future directions in this critical domain.
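As background, a coordinate-wise median is one simple example of the class of Byzantine-robust aggregation rules benchmarked in the abstract. The sketch below, using made-up gradient values and a sign-flipping attacker (not the paper's actual attacks, datasets, or settings), illustrates why such a rule can resist a single poisoned update where plain FedAvg cannot:

```python
import numpy as np

rng = np.random.default_rng(0)

def fedavg(updates):
    """Plain (non-robust) averaging: coordinate-wise mean of client updates."""
    return np.mean(np.stack(updates), axis=0)

def median_aggregate(updates):
    """Coordinate-wise median, a simple Byzantine-robust aggregation rule."""
    return np.median(np.stack(updates), axis=0)

# Nine honest clients send similar gradients near [1, 1, 1] (hypothetical values);
# one Byzantine client submits a scaled, sign-flipped update.
honest = [np.ones(3) + 0.1 * rng.standard_normal(3) for _ in range(9)]
poisoned = -10.0 * np.ones(3)
updates = honest + [poisoned]

print("FedAvg:", fedavg(updates))            # dragged toward the attacker
print("Median:", median_aggregate(updates))  # stays near the honest consensus
```

With a minority of attackers, the median ignores the extreme poisoned coordinates, while the mean is pulled far from the honest consensus; stronger state-of-the-art attacks, as the paper evaluates, are crafted to evade even such robust rules.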
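The client-selection observation can be made concrete with a small probability calculation: when each round uniformly samples a few devices out of many, the chance that malicious clients form a majority in any given round stays small. The function below computes a hypergeometric tail probability; all numbers are hypothetical and not the paper's experimental settings:

```python
from math import comb

def prob_malicious_majority(n_clients, n_malicious, sample_size):
    """Probability that a uniformly sampled round contains a malicious
    majority (hypergeometric tail). Illustrates why per-round client
    subsampling in cross-device FL limits an attacker's influence."""
    total = comb(n_clients, sample_size)
    threshold = sample_size // 2 + 1
    return sum(
        comb(n_malicious, k) * comb(n_clients - n_malicious, sample_size - k)
        for k in range(threshold, min(n_malicious, sample_size) + 1)
    ) / total

# Hypothetical setting: 1000 devices, 200 compromised (20%), 10 sampled per round.
print(round(prob_malicious_majority(1000, 200, 10), 4))
```

Even with 20% of the population compromised, a malicious majority in a 10-client round is rare, which is one intuition for why client-selection strategies can act as a lightweight defense in cross-device settings.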