Patterns (Nov 2023)
MERGE: A model for multi-input biomedical federated learning
Abstract
Summary: Driven by the deep learning (DL) revolution, artificial intelligence (AI) has become a fundamental tool for many biomedical tasks, including analyzing and classifying diagnostic images. Imaging, however, is not the only source of information. Tabular data, such as personal and genomic data and blood test results, are routinely collected but rarely considered in DL pipelines. Moreover, DL requires large datasets that often must be pooled from different institutions, raising non-trivial privacy concerns. Federated learning (FL) is a cooperative learning paradigm that aims to address these issues by moving models instead of data across different institutions. Here, we present a federated multi-input architecture using images and tabular data as a methodology to enhance model performance while preserving data privacy. We evaluated it on two showcases, the prognosis of COVID-19 and patient stratification in Alzheimer’s disease, providing evidence of enhanced accuracy and F1 scores against single-input models and improved generalizability against non-federated models.

The bigger picture: Deep learning models must be trained with large datasets, which often requires pooling data from different sites and sources. In research fields dealing with sensitive information subject to data regulations, such as biomedical research, data pooling can raise concerns about data access and sharing across institutions, which can affect performance, energy consumption, privacy, and security. Federated learning is a cooperative learning paradigm that addresses such concerns by sharing models instead of data across different institutions.
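To make "moving models instead of data" concrete, the following is a minimal sketch of one aggregation round. The abstract does not name the aggregation rule, so a sample-size-weighted average in the style of FedAvg is assumed here. The function `federated_average`, the parameter names (`image_branch`, `tabular_branch`, `head`), and all numeric values are hypothetical and chosen only to mirror a multi-input layout.

```python
from typing import Dict, List

# A model is represented as a mapping from parameter-group name to a flat
# list of floats. This is a simplification for illustration only.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Sample-size-weighted average of client parameters (FedAvg-style).

    Assumption: every client holds the same parameter groups with the
    same shapes. Only parameters are exchanged, never the training data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Hypothetical locally trained parameters from two institutions. Each model
# has an image branch, a tabular branch, and a joint head.
site_a = {"image_branch": [0.2, 0.4], "tabular_branch": [1.0], "head": [0.0]}
site_b = {"image_branch": [0.6, 0.8], "tabular_branch": [3.0], "head": [2.0]}

# The site with 300 samples contributes three times the weight of the
# site with 100 samples.
global_weights = federated_average([site_a, site_b], client_sizes=[100, 300])
```

In a full round, each institution would then load `global_weights`, continue training on its own images and tabular records, and send the updated parameters back for the next average.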