Journal of Pathology Informatics (Jan 2022)

Immunohistochemistry scoring of breast tumor tissue microarrays: A comparison study across three software applications

  • Gabrielle M. Baker,
  • Vanessa C. Bret-Mounet,
  • Tengteng Wang,
  • Mitko Veta,
  • Hanqiao Zheng,
  • Laura C. Collins,
  • A. Heather Eliassen,
  • Rulla M. Tamimi,
  • Yujing J. Heng

Journal volume & issue
Vol. 13
p. 100118

Abstract

Digital pathology can efficiently assess immunohistochemistry (IHC) data on tissue microarrays (TMAs). Yet, it remains important to evaluate the comparability of the data acquired by different software applications and validate it against pathologist manual interpretation. In this study, we compared the IHC quantification of 5 clinical breast cancer biomarkers, namely estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), epidermal growth factor receptor (EGFR), and cytokeratin 5/6 (CK5/6), across 3 software applications (Definiens Tissue Studio, inForm, and QuPath) and benchmarked the results to pathologist manual scores.

IHC expression for each marker was evaluated across 4 TMAs consisting of 935 breast tumor tissue cores from 367 women within the Nurses’ Health Studies; each woman contributing three 0.6-mm cores. The correlation and agreement between manual and software-derived results were primarily assessed using Spearman’s ρ, percentage agreement, and area under the curve (AUC).

At the TMA core-level, the correlations between manual and software-derived scores were the highest for HER2 (ρ ranging from 0.75 to 0.79), followed by ER (0.69–0.71), PR (0.67–0.72), CK5/6 (0.43–0.47), and EGFR (0.38–0.45). At the case-level, there were good correlations between manual and software-derived scores for all 5 markers (ρ ranging from 0.43 to 0.82), where QuPath had the highest correlations. Software-derived scores were highly comparable to each other (ρ ranging from 0.80 to 0.99). The average percentage agreements between manual and software-derived scores were excellent for ER (90.8%–94.5%) and PR (78.2%–85.2%), moderate for HER2 (65.4%–77.0%), highly variable for EGFR (48.2%–82.8%), and poor for CK5/6 (22.4%–45.0%). All AUCs across markers and software applications were ≥0.83.

The 3 software applications were highly comparable to each other and to manual scores in quantifying these 5 markers. QuPath consistently produced the best performance, indicating this open-source software is an excellent alternative for future use.
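
The three comparison metrics named in the abstract (Spearman's ρ, percentage agreement, and AUC) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' analysis code, and the abstract does not say which statistical software was used. The function names, the eight toy core scores, and the 10% positivity cutoff are all hypothetical, chosen only to show how each metric is computed from paired manual and software-derived scores.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def percent_agreement(calls_a, calls_b):
    """Percentage of items on which two sets of categorical calls match."""
    matches = sum(p == q for p, q in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data: percent positive cells per core (not from the study).
manual_pct = [0, 1, 5, 40, 70, 10, 95, 0]
software_pct = [0.5, 2.0, 12.0, 35.0, 60.0, 8.0, 90.0, 0.0]

# Hypothetical manual positive/negative calls and an assumed 10% software cutoff.
manual_positive = [0, 0, 0, 1, 1, 1, 1, 0]
CUTOFF = 10.0
software_positive = [int(s >= CUTOFF) for s in software_pct]

rho = spearman_rho(manual_pct, software_pct)
agreement = percent_agreement(manual_positive, software_positive)
area = auc(manual_positive, software_pct)

print(f"Spearman rho: {rho:.3f}")
print(f"Percentage agreement: {agreement:.1f}%")
print(f"AUC: {area:.4f}")
```

In this sketch, ρ compares the continuous scores directly, percentage agreement compares the dichotomized calls at one fixed cutoff, and AUC summarizes how well the software score separates manual positives from negatives across all possible cutoffs. That difference is one plausible reason a marker could show a high AUC alongside a low percentage agreement.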

Keywords