Efficiently approaching vertical federated learning by combining data reduction and conditional computation techniques

Francesco Folino; Gianluigi Folino; Francesco Sergio Pisani; Luigi Pontieri; Pietro Sabatino

doi:10.1186/s40537-024-00933-6

Journal of Big Data (May 2024)

Affiliations

Francesco Folino: ICAR-CNR
Gianluigi Folino: ICAR-CNR
Francesco Sergio Pisani: ICAR-CNR
Luigi Pontieri: ICAR-CNR
Pietro Sabatino: ICAR-CNR

DOI: https://doi.org/10.1186/s40537-024-00933-6
Journal volume & issue: Vol. 11, no. 1
pp. 1 – 37

Abstract

This paper proposes a framework, based on a sparse Mixture of Experts (MoE) architecture, for the federated training and application of a distributed classification model in domains (such as cybersecurity and healthcare) where different parties of the federation store different subsets of features for a number of data instances. The framework is designed to limit both the risk of information leakage and the computation/communication costs, in model training (through data sampling) and in model application (by leveraging the conditional-computation abilities of sparse MoEs). Experiments on real data show that the proposed approach achieves a better balance between efficiency and model accuracy than other solutions based on vertical federated learning (VFL). Notably, in a real-life cybersecurity case study on malware classification (the KronoDroid dataset), the proposed method surpasses its competitors even when it uses only 50% or 75% of the training set, which the competing approaches use in full. In these two settings it reduces the false-positive rate by 16.9% and 18.2%, respectively, while also delivering satisfactory results on the other evaluation metrics. These results showcase the framework's potential to significantly enhance cybersecurity threat detection and prevention in a collaborative yet secure manner.
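
To make the notion of conditional computation in a sparse MoE more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each party owns one local expert over its own feature subset, and that gate scores are supplied by some coordinator. The names `sparse_moe_predict`, `gate_scores`, `experts`, and `party_features` are hypothetical. Only the top-k experts are invoked for an instance, so only those parties compute and communicate.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: an expert maps a party's local features to class probabilities.
Expert = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_moe_predict(
    gate_scores: Sequence[float],
    experts: Sequence[Expert],
    party_features: Sequence[Sequence[float]],
    k: int = 1,
) -> Tuple[List[float], List[int]]:
    """Combine the outputs of only the k highest-scoring experts.

    Assumption (not from the paper): gate_scores[i] is the gate's score for
    the expert held by party i, and party_features[i] is that party's local
    feature subset for the instance being classified.
    """
    # Indices of the k experts with the highest gate scores.
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Renormalise the gate over the selected experts only.
    weights = softmax([gate_scores[i] for i in top])

    combined: List[float] = []
    for w, i in zip(weights, top):
        probs = experts[i](party_features[i])  # unselected parties are never called
        if not combined:
            combined = [0.0] * len(probs)
        combined = [c + w * p for c, p in zip(combined, probs)]
    return combined, top
```

With `k=1`, a single party's expert is evaluated per instance; raising `k` trades extra computation and communication for a prediction that blends more parties' views.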

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
