Nature Communications (Nov 2024)

An explainable longitudinal multi-modal fusion model for predicting neoadjuvant therapy response in women with breast cancer

  • Yuan Gao,
  • Sofia Ventura-Diaz,
  • Xin Wang,
  • Muzhen He,
  • Zeyan Xu,
  • Arlene Weir,
  • Hong-Yu Zhou,
  • Tianyu Zhang,
  • Frederieke H. van Duijnhoven,
  • Luyi Han,
  • Xiaomei Li,
  • Anna D’Angelo,
  • Valentina Longo,
  • Zaiyi Liu,
  • Jonas Teuwen,
  • Marleen Kok,
  • Regina Beets-Tan,
  • Hugo M. Horlings,
  • Tao Tan,
  • Ritse Mann

DOI
https://doi.org/10.1038/s41467-024-53450-8
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 17

Abstract

Multi-modal image analysis using deep learning (DL) lays the foundation for neoadjuvant treatment (NAT) response monitoring. However, existing methods prioritize extracting multi-modal features to enhance predictive performance, with limited consideration of real-world clinical applicability, particularly in longitudinal NAT scenarios with multi-modal data. Here, we propose the Multi-modal Response Prediction (MRP) system, designed to mimic real-world physician assessments of NAT responses in breast cancer. To enhance feasibility, MRP integrates cross-modal knowledge mining and a temporal information embedding strategy, so that it can handle missing modalities and remains less affected by different NAT settings. We validated MRP through multi-center studies and multinational reader studies. MRP exhibited robustness comparable to that of breast radiologists and outperformed human readers in predicting pathological complete response in the Pre-NAT phase (ΔAUROC 14% and 10% on in-house and external datasets, respectively). Furthermore, we assessed MRP’s clinical utility and its impact on treatment decision-making. MRP may have profound implications for enrolment into NAT trials and for determining the extent of surgery.