BMC Medical Research Methodology (Dec 2016)
Inter-rater reliability of the QuIS as an assessment of the quality of staff-inpatient interactions
Abstract
Background: Recent studies of the quality of in-hospital care have used the Quality of Interaction Schedule (QuIS) to rate interactions observed between staff and inpatients in a variety of ward conditions. The QuIS was developed and evaluated in nursing and residential care. We set out to develop methodology for summarising information from inter-rater reliability studies of the QuIS in the acute hospital setting.
Methods: Staff-inpatient interactions were rated by trained staff observing care delivered during two-hour observation periods. Anticipating that the quality of care might vary with ward conditions, we selected wards and times of day to reflect the variety of daytime care delivered to patients. We estimated inter-rater reliability using weighted kappa, $\kappa_w$, combined over observation periods to produce an overall summary estimate, $\widehat{\kappa}_w$. We compared weighting schemes that put different emphasis on the severity of misclassification between QuIS categories, as well as different methods of combining observation-period-specific estimates.
Results: The estimated $\widehat{\kappa}_w$ did not vary greatly with the weighting scheme employed, but simple averaging of estimates across observation periods produced a higher value of inter-rater reliability because it over-weighted the observation periods with the fewest interactions.
Conclusions: We recommend that researchers who evaluate the inter-rater reliability of the QuIS by observing staff-inpatient interactions during observation periods representing the variety of ward conditions in which care takes place should summarise inter-rater reliability by $\kappa_w$, weighted according to our scheme A4. Observation-period-specific estimates should be combined into a single overall summary statistic, $\widehat{\kappa}_{w\,\mathrm{random}}$, using a random effects approach; $\widehat{\kappa}_{w\,\mathrm{random}}$ is to be interpreted as the mean of the distribution of $\kappa_w$ across the variety of ward conditions. We draw attention to issues in the analysis and interpretation of inter-rater reliability studies incorporating distinct phases of data collection that may generalise more widely.
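To make the two steps in the abstract concrete (a weighted kappa per observation period, then a random effects combination), the following is a minimal Python sketch and not the authors' implementation. It makes several assumptions: the weight matrix is supplied by the caller (scheme A4 is not specified in the abstract, so no weights are hard-coded), a variance for each period-specific estimate is already available (for example from a bootstrap), and the between-period variance is estimated with the DerSimonian-Laird moment estimator, a common choice that the abstract does not confirm. The function names `weighted_kappa` and `pool_random_effects` are hypothetical.

```python
import numpy as np


def weighted_kappa(table, weights):
    """Cohen's weighted kappa for one observation period.

    table   : k x k counts, rows = rater 1 category, columns = rater 2 category.
    weights : k x k agreement weights (1 on the diagonal, smaller values for
              more severe disagreement). Supplied by the caller; the paper's
              scheme A4 is NOT reproduced here.
    """
    counts = np.asarray(table, dtype=float)
    w = np.asarray(weights, dtype=float)
    p = counts / counts.sum()            # joint proportions
    row = p.sum(axis=1)                  # rater 1 marginals
    col = p.sum(axis=0)                  # rater 2 marginals
    p_obs = (w * p).sum()                # weighted observed agreement
    p_exp = (w * np.outer(row, col)).sum()  # weighted chance agreement
    return (p_obs - p_exp) / (1.0 - p_exp)


def pool_random_effects(estimates, variances):
    """Random effects mean of period-specific kappas.

    Uses the DerSimonian-Laird estimator of the between-period variance
    (an assumption of this sketch). `variances` are the within-period
    variances of the estimates and must be positive.
    """
    y = np.asarray(estimates, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # inverse-variance weights
    fixed = (w * y).sum() / w.sum()               # fixed effect mean
    q = (w * (y - fixed) ** 2).sum()              # heterogeneity statistic
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_star = 1.0 / (v + tau2)                     # random effects weights
    return (w_star * y).sum() / w_star.sum()
```

Because each period is weighted by the inverse of its total variance, a period with few interactions (and therefore a large variance) contributes less than it would under simple averaging, which is the over-weighting issue the Results describe.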
Keywords