The analytical and clinical validity of AI algorithms to score TILs in TNBC: can we use different machine learning models interchangeably?

Joan Martínez Vidal; Nikos Tsiknakis; Johan Staaf; Ana Bosch; Anna Ehinger; Emma Nimeus; Roberto Salgado; Yalai Bai; David L. Rimm; Johan Hartman; Balazs Acs

EClinicalMedicine (Dec 2024)

Affiliations

Joan Martínez Vidal: Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden
Nikos Tsiknakis: Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden
Johan Staaf: Division of Oncology, Department of Clinical Sciences Lund, Lund University, Medicon Village, SE-22381, Lund, Sweden
Ana Bosch: Division of Oncology, Department of Clinical Sciences Lund, Lund University, Medicon Village, SE-22381, Lund, Sweden; Department of Hematology, Oncology and Radiation Physics, Region Skåne, Lund, Sweden
Anna Ehinger: Department of Genetics, Pathology and Molecular Diagnostics, Laboratory Medicine, Region Skåne, Lund, Sweden
Emma Nimeus: Division of Oncology, Department of Clinical Sciences Lund, Lund University, Medicon Village, SE-22381, Lund, Sweden; Division of Surgery, Department of Clinical Sciences, Lund University, Lund, Sweden; Department of Surgery, Skåne University Hospital, Malmö, Sweden
Roberto Salgado: Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium; Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia
Yalai Bai: Department of Pathology, Yale School of Medicine, New Haven, CT, USA
David L. Rimm: Department of Pathology, Yale School of Medicine, New Haven, CT, USA; Department of Internal Medicine (Medical Oncology), Yale University School of Medicine, New Haven, CT, USA
Johan Hartman: Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden; Department of Clinical Pathology and Cancer Diagnostics, Karolinska University Hospital, Stockholm, Sweden
Balazs Acs: Department of Oncology and Pathology, Karolinska Institutet, Stockholm, Sweden; Department of Clinical Pathology and Cancer Diagnostics, Karolinska University Hospital, Stockholm, Sweden; Corresponding author. Department of Oncology and Pathology, Karolinska Institutet, Bioclinicum NKS J5:20, Solnavägen 30, Solna 171 64, Sweden.

Journal volume & issue: Vol. 78
p. 102928

Abstract

Background: Pathologist-read tumor-infiltrating lymphocytes (TILs) have shown predictive and prognostic potential in early and metastatic triple-negative breast cancer (TNBC), but their assessment is still subject to variability. Artificial intelligence (AI) is a promising approach toward eliminating this variability and objectively automating TILs assessment. However, demonstrating robust analytical and prognostic validity is the key challenge currently preventing the integration of AI models into clinical workflows.

Methods: We evaluated the impact of ten AI models on TILs scoring, emphasizing their differences in analytical and prognostic validity. The AI-based TILs scoring models (seven developed and three previously validated AI models) were tested in a retrospective analytical cohort and in an independent prospective cohort to compare prognostic validity against the invasive disease-free survival (IDFS) endpoint, with a median follow-up of 4 years. The development and analytical validity set consisted of diagnostic tissue slides from 79 women with surgically resected primary invasive TNBC tumors diagnosed between 2012 and 2016 at the Yale School of Medicine. An independent set comprising 215 TNBC patients from Sweden, diagnosed between 2010 and 2015, was used for testing prognostic validity.

Findings: We highlight a significant difference in analytical validity (Spearman's r = 0.63–0.73, p < 0.001) across AI methodologies and training strategies. Interestingly, digital TILs were prognostic for eight out of ten AI models, including less extensively trained ones, with similar and overlapping hazard ratios (HR) in the external validation cohort (Cox regression analysis based on the IDFS endpoint, HR = 0.40–0.47; p < 0.004).

Interpretation: The demonstrated prognostic validity of most of the AI TIL models can be attributed to the intrinsic robustness of host anti-tumor immunity (measured by TILs) as a biomarker. However, the discrepancies between AI models should not be overlooked; rather, we believe that there is a critical need for an accessible, large, multi-centric dataset that will serve as a benchmark, ensuring the comparability and reliability of different AI tools in clinical implementation.

Funding: Nikos Tsiknakis is supported by the Swedish Research Council (Grant Number 2021-03061, Theodoros Foukakis). Balazs Acs is supported by a postdoctoral grant from The Swedish Society for Medical Research (Svenska Sällskapet för Medicinsk Forskning). Roberto Salgado is supported by a grant from the Breast Cancer Research Foundation (BCRF).

Published in EClinicalMedicine

ISSN: 2589-5370 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General)
Website: https://www.thelancet.com/journals/eclinm/home
