IEEE Access (Jan 2024)
Decision Trees in Federated Learning: Current State and Future Opportunities
Abstract
Federated learning (FL) is a distributed machine learning technique that enables multiple decentralized clients to develop a model collaboratively without exchanging their local data. Heightened privacy risks and recent strict privacy laws make it increasingly precarious to gather and integrate data in a centralized location for full utilization. Federated learning is compatible with established privacy laws such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), and China’s Cybersecurity Law. Further, there are very few scenarios in which centralized, properly labeled, and complete data are available, and federated learning provides a way to address this problem. As a result, much research has been conducted in several areas within the emerging field of FL. This review paper focuses on decision tree-based FL systems because of their desirable properties of interpretability, parallelism, and high performance. We take a closer look at the motivations, design considerations, tree-building algorithms, and security mechanisms used for these systems. We also present the datasets used in these systems, their demonstrated application areas, and the evidence of their benefits. The objective of this paper is to provide researchers in academia and system architects in industry with an informative overview of the characteristics of FL, the privacy and security mechanisms used in FL, the available open source development frameworks for FL, and the decision tree-based systems developed in FL.
Keywords