Intelligent Systems with Applications (May 2022)
Leveraging asynchronous federated learning to predict customers financial distress
Abstract
In recent years, as economic stability has been shaken and unemployment has risen sharply due to COVID-19, assigning credit scores by predicting consumers’ financial conditions has become more crucial. Conventional machine learning (ML) and deep learning approaches need to share customers’ sensitive information with an external credit bureau to generate a prediction model, which opens the door to privacy leakage. A recently introduced privacy-preserving distributed ML scheme, referred to as federated learning (FL), enables generating a target model without sharing local information, through on-device model training on edge resources. In this paper, we propose an FL-based application to predict customers’ financial issues by constructing a global learning model that evolves from the local models of distributed agents. The local models are generated by the network agents using their on-device data and local resources. We adopt the FL concept because the learning strategy does not require sharing any data with the server or any other agent, which ensures the preservation of customers’ sensitive data. To that end, we enable partial work from weak agents, which eliminates the problem of model convergence being retarded by straggler agents. We also leverage asynchronous FL, which cuts the extra waiting time during global model generation. We simulated the performance of our FL model on a popular dataset, Give Me Some Credit (Freshcorn, 2017). We evaluated our proposed method with different numbers of stragglers and under various computational settings (e.g., local epochs, batch size), and simulated the training loss and testing accuracy of the prediction model. Finally, we compared the F1-score of our proposed model with existing centralized and decentralized approaches.
Our results show that our proposed model achieves an F1-score almost identical to that of the centralized model even at a skew level of more than 80%, and outperforms state-of-the-art FL models by an average of 5∼6% higher accuracy when resource-constrained agents are present in the learning environment.
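The asynchronous update described above, in which the server folds each arriving local model into the global model immediately instead of waiting for all agents, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the staleness-decay rule, the `alpha` mixing parameter, and the function names are hypothetical, in the spirit of staleness-weighted asynchronous FL schemes such as FedAsync.

```python
import numpy as np

def staleness_weight(staleness, alpha=0.6):
    """Hypothetical decay: older (stale) updates get a smaller mixing weight."""
    return alpha / (1.0 + staleness)

def async_update(global_w, local_w, staleness, alpha=0.6):
    """Mix one agent's local model into the global model immediately,
    without waiting for the other agents (asynchronous FL)."""
    a = staleness_weight(staleness, alpha)
    return [(1.0 - a) * g + a * l for g, l in zip(global_w, local_w)]

# Toy example: a two-layer model, global weights at zero, local at one.
global_w = [np.zeros(3), np.zeros(2)]
local_w = [np.ones(3), np.ones(2)]

fresh = async_update(global_w, local_w, staleness=0)   # fresh update, full weight
stale = async_update(global_w, local_w, staleness=5)   # straggler's late update
print(fresh[0])  # each entry moved 0.6 toward the local model
print(stale[0])  # moved much less, since the update is 5 rounds stale
```

Because the server never blocks on a synchronization barrier, a straggler's partial or late contribution is still absorbed, only with reduced influence, which matches the paper's goal of removing extra waiting time while still using partial work from weak agents.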