Scientific Reports (May 2023)

Unsatisfactory reproducibility of interstitial inflammation scoring in allograft kidney biopsy

  • Shun-Chen Huang
  • Yi-Jia Lin
  • Mei-Chin Wen
  • Wei-Chou Lin
  • Pei-Wei Fang
  • Peir-In Liang
  • Hao-Wen Chuang
  • Hui-Ping Chien
  • Tai-Di Chen

DOI
https://doi.org/10.1038/s41598-023-33908-3
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 8

Abstract

Interstitial inflammation scoring is incorporated into the Banff Classification of Renal Allograft Pathology and is essential for the diagnosis of T-cell mediated rejection. However, its reproducibility, including inter-rater and intra-rater reliability, has not been carefully investigated. In this study, eight renal pathologists from different hospitals independently scored 45 kidney allograft biopsies with varying extents of interstitial inflammation. Inter-rater and intra-rater reliabilities were investigated by kappa statistics and conditional agreement probabilities. Individual pathologists' scoring patterns were examined by chi-squared tests and tests of proportions. The mean pairwise kappa values for inter-rater reliability were 0.27, 0.30, and 0.26 for the Banff i score, ti score, and i-IFTA, respectively. No rater pair performed consistently better or worse than others on all three scorings. After dichotomizing the scores into two groups (none/mild and moderate/severe inflammation), the averaged conditional agreements ranged from 47.1% to 50.0%. The distributions of the scores differed, and some pathologists persistently scored higher or lower than others. Given the important role of interstitial inflammation scoring in the diagnosis of T-cell mediated rejection, transplant practitioners should be aware of the possible clinical implications of the far-from-optimal reproducibility.
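For readers unfamiliar with the statistics named in the abstract, the following is a minimal sketch of how a mean pairwise kappa and a conditional agreement after dichotomization might be computed. It is not the authors' analysis code. Several things here are assumptions: the abstract does not say which kappa variant was used, so unweighted Cohen's kappa is shown; conditional agreement is taken to be the symmetrized probability that one rater assigns a category given that the other did; scores are assumed to be coded 0 to 3 with a cutoff of 2 separating none/mild from moderate/severe; and the rater names and scores are made up purely for illustration.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Average kappa over all rater pairs; ratings maps rater -> score list."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(ratings[r], ratings[s]) for r, s in pairs) / len(pairs)


def dichotomize(scores, cutoff=2):
    """Assumed coding: scores below cutoff -> 0 (none/mild), otherwise 1."""
    return [int(s >= cutoff) for s in scores]


def conditional_agreement(a, b, category):
    """Symmetrized P(one rater assigns category | the other rater did)."""
    both = sum(x == category and y == category for x, y in zip(a, b))
    total = sum(x == category for x in a) + sum(y == category for y in b)
    return 2 * both / total if total else float("nan")


# Hypothetical scores for three raters on eight biopsies (not study data).
ratings = {
    "rater_A": [0, 1, 2, 3, 1, 0, 2, 3],
    "rater_B": [0, 2, 2, 3, 0, 0, 1, 3],
    "rater_C": [1, 1, 3, 3, 1, 0, 2, 2],
}

print("mean pairwise kappa:", round(mean_pairwise_kappa(ratings), 3))

binary = {rater: dichotomize(scores) for rater, scores in ratings.items()}
for r, s in combinations(binary, 2):
    agreement = conditional_agreement(binary[r], binary[s], category=1)
    print(r, s, "agreement on moderate/severe:", round(agreement, 3))
```

Averaging over rater pairs, as sketched here, gives one summary figure per scoring item. A multi-rater statistic such as Fleiss' kappa would be an alternative design, but the abstract reports pairwise means.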