The Journal of Pathology: Clinical Research (Mar 2024)
A deep‐learning workflow to predict upper tract urothelial carcinoma protein‐based subtypes from H&E slides supporting the prioritization of patients for molecular testing
Abstract
Upper tract urothelial carcinoma (UTUC) is a rare and aggressive, yet understudied, urothelial carcinoma (UC). The more frequent UC of the bladder comprises several molecular subtypes, associated with different targeted therapies and overlapping with protein‐based subtypes. However, whether and how these findings extend to UTUC remains unclear. Artificial intelligence‐based approaches could help elucidate UTUC's biology and extend access to targeted treatments to a wider patient population. Here, UTUC protein‐based subtypes were identified, and a deep‐learning (DL) workflow was developed to predict them directly from routine histopathological H&E slides. Protein‐based subtypes in a retrospective cohort of 163 invasive tumors were assigned by hierarchical clustering of the immunohistochemical expression of three luminal (FOXA1, GATA3, and CK20) and three basal (CD44, CK5, and CK14) markers. Cluster analysis identified distinctive luminal (N = 80) and basal (N = 42) subtypes. The luminal subtype mostly included pushing, papillary tumors, whereas the basal subtype mostly included diffusely infiltrating, non‐papillary tumors. DL model building relied on a transfer‐learning approach by fine‐tuning a pre‐trained ResNet50. Classification performance was measured via three‐fold repeated cross‐validation. Mean areas under the receiver operating characteristic curve of 0.83 (95% CI: 0.67–0.99), 0.8 (95% CI: 0.62–0.99), and 0.81 (95% CI: 0.65–0.96) were reached in the three repetitions, respectively. High‐confidence DL‐based predicted subtypes showed significant associations (p < 0.001) with morphological features, i.e., tumor type, histological subtypes, and infiltration type.
Furthermore, a significant association was found with programmed cell death ligand 1 (PD‐L1) combined positive score (p < 0.001) and FGFR3 mutational status (p = 0.002), with high‐confidence basal predictions containing a higher proportion of PD‐L1 positive samples and high‐confidence luminal predictions a higher proportion of FGFR3‐mutated samples. Testing of the DL model on an independent cohort highlighted the importance of accommodating histological subtypes. Taken together, our DL workflow can predict protein‐based UTUC subtypes, associated with the presence of targetable alterations, directly from H&E slides.
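The evaluation scheme named in the abstract (three‐fold cross‐validation, repeated three times, summarized as a mean area under the receiver operating characteristic curve per repetition) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `score_fn` hook standing in for model training and inference, the stratified fold assignment, and the choice of basal as the positive class are all assumptions made for illustration, and confidence intervals are not computed.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, spreading each class evenly."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def repeated_cv_auroc(labels, score_fn, k=3, repeats=3, seed=0):
    """Return one mean held-out AUROC per repetition.

    score_fn(train_idx, test_idx) is a hypothetical hook: it should fit a
    model on train_idx and return one score per index in test_idx.
    """
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        folds = stratified_folds(labels, k, rng)
        fold_aucs = []
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g in range(k) if g != f for i in folds[g]]
            scores = score_fn(train_idx, test_idx)
            fold_aucs.append(auroc([labels[i] for i in test_idx], scores))
        per_repeat.append(mean(fold_aucs))
    return per_repeat


if __name__ == "__main__":
    # Class sizes follow the abstract (80 luminal, 42 basal); the scores
    # below are random placeholders, not model outputs.
    labels = [0] * 80 + [1] * 42  # assumption: 0 = luminal, 1 = basal
    noise = random.Random(1)
    placeholder = lambda train_idx, test_idx: [noise.random() for _ in test_idx]
    print(repeated_cv_auroc(labels, placeholder))
```

In a real workflow, `score_fn` would wrap fine‐tuning of the network on the training folds and return its predicted basal probability for each held‐out sample; the random placeholder only exercises the fold bookkeeping.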
Keywords