Veterinary Medicine and Science (Jul 2022)

Assessment of agreement using the equine glandular gastric disease grading system in 84 cases

  • Stefanie Pratt,
  • Ian Bowen,
  • Gayle Hallowell,
  • Emma Shipman,
  • Adam Redpath

DOI
https://doi.org/10.1002/vms3.807
Journal volume & issue
Vol. 8, no. 4
pp. 1472 – 1477

Abstract

Introduction Equine glandular gastric disease (EGGD) is a common condition that causes signs of gastric pain, although lesions vary widely in appearance. The only definitive method to diagnose EGGD ante‐mortem is gastroscopy. The currently recommended method for describing these lesions follows the European College of Equine Internal Medicine (ECEIM) consensus guidelines; however, repeatability between users is variable. This study aimed to validate the reliability of lesion descriptions made with the ECEIM consensus guidelines by four blinded equine internal medicine diplomates.
Methods Ninety‐two horses with EGGD with pre‐ and post‐treatment gastroscopy images were identified using the electronic record at a UK equine hospital between 2012 and 2019. Eight horses were excluded due to non‐diagnostic images. Four blinded observers used the recommended grading system to describe images and outcomes. Intraclass correlation coefficients and Krippendorff's alpha were used to determine reliability and agreement, respectively.
Results Intraclass correlation coefficient for severity was 0.782 (95% confidence interval [CI] 0.722–0.832), for distribution was 0.671 (95% CI 0.540–0.763), for the descriptor raised was 0.635 (95% CI 0.479–0.741), fibrinosuppurative was 0.745 (95% CI 0.651–0.812), haemorrhagic was 0.648 (95% CI 0.513–0.744), hyperaemic was 0.389 (95% CI 0.232–0.522) and for outcome was 0.677 (95% CI 0.559–0.770). Krippendorff's alpha for severity was 0.466 (95% CI 0.466–0.418), for distribution was 0.304 (95% CI 0.234–0.374), for the descriptor raised was 0.268 (95% CI 0.207–0.329), fibrinosuppurative was 0.406 (95% CI 0.347–0.463), haemorrhagic was 0.287 (95% CI 0.229–0.344), hyperaemic was 0.112 (95% CI 0.034–0.188) and for outcome was 0.315 (95% CI 0.218–0.408). Intraclass correlation coefficients indicated moderate reliability between observers, whereas Krippendorff's alpha indicated unacceptable agreement between observers.
Discussion These results suggest that the current grading system is not comparable between observers, indicating the need to review the grading system or define more robust criteria.
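
For readers unfamiliar with the agreement statistic named in the Methods, the following is a minimal sketch of how Krippendorff's alpha can be computed from a table of ratings. It is not the authors' analysis code. The abstract does not state which level of measurement was used, so the nominal (category match or mismatch) form is an assumption here, and the function name `krippendorff_alpha_nominal` and the example ratings are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data (assumed form, see lead-in).

    units: list of lists; each inner list holds the ratings that one
    case (e.g. one set of gastroscopy images) received from the observers.
    """
    # Coincidence matrix: every ordered pair of ratings given to the same
    # case by different observers, weighted by 1 / (ratings per case - 1).
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # a case rated only once carries no agreement information
        for c, k in permutations(ratings, 2):
            coincidences[(c, k)] += 1.0 / (m - 1)

    # Marginal totals n_c and the overall number of pairable ratings n.
    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (unnormalised; the normalising
    # factors are folded into the final ratio).
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    )
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")

    return 1.0 - (n - 1) * observed / expected


# Hypothetical ratings: four cases, two observers each.
example = [[1, 1], [1, 2], [2, 2], [2, 1]]
alpha = krippendorff_alpha_nominal(example)
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which is the scale on which the alpha values reported above are read.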

Keywords