BMC Cancer (May 2009)
Inter-observer reproducibility of HER2 immunohistochemical assessment and concordance with fluorescent in situ hybridization (FISH): pathologist assessment compared to quantitative image analysis
Abstract
Background
In breast cancer patients, HER2 overexpression is routinely assessed by immunohistochemistry (IHC), and equivocal cases are then tested by fluorescent in situ hybridization (FISH). Our study compares HER2 scoring by histopathologists with automated quantitation of staining, and determines the concordance of IHC scores with FISH results.

Methods
A tissue microarray was constructed from 1,212 invasive breast carcinoma cases with linked treatment and outcome information. IHC slides were semi-quantitatively scored by two independent pathologists on a scale of 0 to 3+, and were also analyzed with an Ariol automated system by two operators. A total of 616 cases were scorable by both IHC and FISH.

Results
Using data from unequivocal positive (3+) or negative (0, 1+) results, both visual and automated scores were highly consistent: there was excellent concordance between the two pathologists (kappa = 1.000, 95% CI: 1–1), between the two machines (kappa = 1.000, 95% CI: 1–1), and between both visual and both machine scores (kappa = 0.898, 95% CI: 0.775–0.979). The two pathologists successfully distinguished negative, positive and equivocal cases (kappa = 0.929, 95% CI: 0.909–0.946), with excellent agreement with machine 1 scores (kappa = 0.835, 95% CI: 0.806–0.862; kappa = 0.837, 95% CI: 0.81–0.862) and good agreement with machine 2 scores (kappa = 0.698, 95% CI: 0.6723–0.723; kappa = 0.709, 95% CI: 0.684–0.732), whereas the two machines showed good agreement with each other (kappa = 0.806, 95% CI: 0.785–0.826). When comparing categorized IHC scores and FISH results, the agreement was excellent for visual 1 (kappa = 0.814, 95% CI: 0.768–0.856), good for visual 2 (kappa = 0.763, 95% CI: 0.712–0.81) and machine 1 (kappa = 0.665, 95% CI: 0.609–0.718), and moderate for machine 2 (kappa = 0.535, 95% CI: 0.485–0.584).

Conclusion
A fully automated image analysis system run by an experienced operator can provide results consistent with visual HER2 scoring. Further development of such systems will likely improve the accuracy of detection and categorization of membranous staining, making this technique suitable for use in quality assurance programs and eventually in clinical practice.
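
The agreement statistics reported above are kappa values computed on categorized IHC scores. As an illustration only, the following minimal sketch shows how such a value could be obtained for two raters. It assumes unweighted Cohen's kappa and assumes that 2+ is the equivocal category (the abstract names only 0/1+ as negative and 3+ as positive). The abstract does not state whether the study used weighted or unweighted kappa or how the confidence intervals were derived, so neither is reproduced here. The function names `categorize` and `cohen_kappa` are hypothetical, and the scores are toy values, not study data.

```python
from collections import Counter


def categorize(ihc_score):
    """Map a 0-3 IHC score to a category.

    Assumption: 0 and 1+ are negative, 3+ is positive, and 2+ is
    treated as equivocal (implied, not stated, in the abstract).
    """
    if ihc_score in (0, 1):
        return "negative"
    if ihc_score == 2:
        return "equivocal"
    if ihc_score == 3:
        return "positive"
    raise ValueError(f"unexpected IHC score: {ihc_score!r}")


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Degenerate case: both raters used a single identical label.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy scores for ten hypothetical cases (NOT data from the study).
visual_1 = [0, 1, 2, 3, 3, 0, 2, 1, 3, 0]
visual_2 = [0, 1, 2, 3, 2, 0, 2, 0, 3, 1]

kappa = cohen_kappa(
    [categorize(s) for s in visual_1],
    [categorize(s) for s in visual_2],
)
print(round(kappa, 3))  # 0.841
```

In this toy example the raters disagree on one case out of ten after categorization (observed agreement 0.90, chance agreement 0.37), which gives a kappa of about 0.84. Collapsing 0 and 1+ into a single negative category before computing kappa means that a 0 versus 1+ disagreement does not count against agreement.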