Journal of Orthopaedic Surgery and Research (Nov 2018)
Low inter-observer agreement among experienced shoulder surgeons assessing overstuffing of glenohumeral resurfacing hemiarthroplasty based on plain radiographs
Abstract
Background In a clinical setting, visual evaluation of post-implant radiographs is often used to assess the restoration of glenohumeral joint anatomy after resurfacing hemiarthroplasty. Together with other parameters, it is part of the decision-making process when evaluating patients with inferior clinical results. However, the reliability of this method of visual evaluation has not been reported. The aim of this study was to investigate the inter- and intra-observer agreement among experienced shoulder surgeons assessing overstuffing, implant positioning, and size following resurfacing hemiarthroplasty using plain standardized radiographs.

Methods Six experienced shoulder surgeons independently classified implant inclination, implant size, and whether the joint seemed overstuffed in 219 post-implant radiographs. All cases were classified twice, 3 weeks apart. Only radiographs with an anterior-posterior projection and a freely visible joint space were used. Non-weighted Cohen’s kappa values were calculated for each coder pair, and the mean was used as an estimate of the overall inter-observer agreement.

Results The overall inter-observer agreement was moderate in both rounds for implant size (kappa, 0.48 and 0.41) and for inclination angle (kappa, 0.46 and 0.44), but only fair for the evaluation of stuffing of the joint (kappa, 0.24 and 0.28). Intra-observer agreement ranged from fair to substantial for implant size and stuffing, and from moderate to substantial for inclination.

Conclusions Our results indicate that visual evaluation of plain standardized radiographs may be inadequate for assessing overstuffing, implant positioning, and size following resurfacing hemiarthroplasty. Future studies may help to elucidate whether reliability increases if consensus is reached on clear definitions and standardized methods of evaluation.
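The agreement statistic described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the rater labels, and the toy ratings are hypothetical. The verbal labels (fair, moderate, substantial) are assumed to follow the commonly used Landis and Koch bands, which the abstract does not name explicitly.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def cohen_kappa(a, b):
    """Non-weighted Cohen's kappa for two raters' labels on the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must classify the same non-empty set of cases")
    n = len(a)
    # Observed agreement: share of cases where both raters chose the same category.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and this sketch chooses to report it as perfect agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def mean_pairwise_kappa(ratings):
    """Mean kappa over every rater pair (15 pairs for six raters)."""
    return mean(cohen_kappa(a, b) for a, b in combinations(ratings.values(), 2))


def agreement_label(kappa):
    """Verbal label, assuming the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical toy data: three raters judging four radiographs for overstuffing.
toy = {
    "rater_1": ["yes", "yes", "no", "no"],
    "rater_2": ["yes", "no", "no", "no"],
    "rater_3": ["yes", "yes", "no", "no"],
}
overall = mean_pairwise_kappa(toy)
print(round(overall, 2), agreement_label(overall))
```

Under the assumed bands, the reported kappa values of 0.24 and 0.28 for stuffing fall in the "fair" range, while 0.41 to 0.48 for size and inclination fall in the "moderate" range, matching the wording of the Results.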