Journal of Pathology Informatics (Jan 2022)
A tool for federated training of segmentation models on whole slide images
Abstract
The largest bottleneck to the development of convolutional neural network (CNN) models in computational pathology is the collection and curation of diverse training datasets. Training CNNs requires large cohorts of image data, and model generalizability depends on the heterogeneity of the training data. Including data from multiple centers enhances the generalizability of CNN-based models, but doing so is hindered by the logistical challenges of sharing medical data. In this paper, we explore the feasibility of training our recently developed cloud-based segmentation tool (Histo-Cloud) using federated learning. Using a dataset of renal tissue biopsies, we show that federated training to segment interstitial fibrosis and tubular atrophy (IFTA) on datasets from three institutions is not found to differ in performance from training on the same data pooled on one server, when both are tested on a fourth (holdout) institution’s data. Further, training a model to segment glomeruli on a federated dataset (split by staining) demonstrates similar performance.
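The abstract does not state which aggregation rule is used to combine the models trained at each institution. Purely as an illustration of the general idea, the sketch below assumes federated averaging, in which each site trains locally and a server averages the resulting weights in proportion to each site's sample count. The function name, the layer name `conv1`, the weight values, and the sample counts are all hypothetical and are not taken from Histo-Cloud.

```python
from typing import Dict, List

# A model is represented here as a mapping from layer name to a flat list of
# weights. This is an illustrative simplification, not the Histo-Cloud format.
Weights = Dict[str, List[float]]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Combine per-site weights into one global model (assumed FedAvg rule).

    Each site's contribution is scaled by its number of training samples,
    so only weights, never images, have to leave an institution.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need exactly one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in site_weights[0]:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Three hypothetical institutions, each holding locally trained weights.
site_weights = [
    {"conv1": [1.0, 2.0]},
    {"conv1": [3.0, 4.0]},
    {"conv1": [5.0, 6.0]},
]
site_sizes = [100, 100, 200]  # hypothetical training samples per site

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

In a full training loop the averaged weights would be sent back to every site for the next round of local training; that loop, and any communication or privacy layer around it, is omitted here.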