Scientific Data (Oct 2024)

A multicenter bladder cancer MRI dataset and baseline evaluation of federated learning in clinical application

  • Kangyang Cao,
  • Yujian Zou,
  • Chang Zhang,
  • Weijing Zhang,
  • Jie Zhang,
  • Guojie Wang,
  • Chu Zhang,
  • Jiegeng Lyu,
  • Yue Sun,
  • Hongyuan Zhang,
  • Bin Huang,
  • Lei Deng,
  • Shuiqing Yang,
  • Jianpeng Li,
  • Bingsheng Huang

DOI
https://doi.org/10.1038/s41597-024-03971-0
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 10

Abstract

Bladder cancer (BCa), the most common malignant tumor of the urinary system, has received significant attention in research on clinical applications of artificial intelligence algorithms. However, some studies train BCa models on data pooled from multiple medical institutions, which may pose a privacy risk. Protecting patient privacy during the training of machine learning algorithms therefore requires substantial attention. Federated learning (FL) is an emerging machine learning paradigm that addresses this concern: it enables multiple entities to collaboratively build machine learning models while preserving data privacy and security. In this study, we present a multicenter BCa magnetic resonance imaging (MRI) dataset. The dataset comprises 275 three-dimensional bladder T2-weighted MRI scans collected from four medical centers; each scan includes a pathological diagnostic label for muscle invasion and pixel-level annotations of tumor contours. Four FL methods are used to establish baselines on the dataset for two tasks: diagnosis of muscle-invasive bladder cancer and automatic segmentation of bladder tumor lesions.
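To make the federated learning idea concrete, the following is a minimal sketch of sample-weighted parameter averaging in the style of FedAvg, a widely used FL aggregation rule. The abstract does not name the four FL methods evaluated, so FedAvg is an assumption chosen for illustration and is not presented as the authors' method. The function name `fedavg`, the toy parameter vectors, and the per-center sample counts are all hypothetical and are not taken from the dataset.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Only model parameters are exchanged; the raw scans stay at each center.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of the same length")

    aggregated = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            aggregated[i] += value * n / total
    return aggregated


# Hypothetical round with four centers (toy values, not the real dataset).
center_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
center_sizes = [10, 20, 30, 40]
global_weights = fedavg(center_weights, center_sizes)
```

In a full FL round, the server would send `global_weights` back to each center, the centers would train locally for a few epochs, and the aggregation step would repeat. Weighting by sample count means larger centers influence the global model more, which is one design choice among several.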