PLoS ONE (Sep 2016)

Agreement between Computerized and Human Assessment of Performance on the Ruff Figural Fluency Test.

  • Martin F Elderson,
  • Sander Pham,
  • Marlise E A van Eersel,
  • LifeLines Cohort Study,
  • Bruce H R Wolffenbuttel,
  • Johan Kok,
  • Ron T Gansevoort,
  • Oliver Tucha,
  • Melanie M van der Klauw,
  • Joris P J Slaets,
  • Gerbrand J Izaks

DOI: https://doi.org/10.1371/journal.pone.0163286
Journal volume & issue: Vol. 11, no. 9, p. e0163286

Abstract

The Ruff Figural Fluency Test (RFFT) is a sensitive test for nonverbal fluency suitable for all age groups. However, assessment of performance on the RFFT is time-consuming and may be affected by interrater differences. Therefore, we developed computer software specifically designed to analyze performance on the RFFT by automated pattern recognition. The aim of this study was to compare assessment by the new software with conventional assessment by human raters. The software was developed using data from the Lifelines Cohort Study and validated in an independent cohort of the Prevention of Renal and Vascular End Stage Disease (PREVEND) study. The total study population included 1,761 persons: 54% men; mean age (SD), 58 (10) years. All RFFT protocols were assessed by the new software and two independent human raters (criterion standard). The mean number of unique designs (SD) was 81 (29) and the median number of perseverative errors (interquartile range) was 9 (4 to 16). The intraclass correlation coefficient (ICC) between the computerized and human assessment was 0.994 (95% CI, 0.988 to 0.996; p < 0.001) and 0.991 (95% CI, 0.990 to 0.991; p < 0.001) for the number of unique designs and perseverative errors, respectively. The mean difference (SD) between the computerized and human assessment was -1.42 (2.78) and +0.02 (1.94) points for the number of unique designs and perseverative errors, respectively. This was comparable to the agreement between two independent human assessments: ICC, 0.995 (0.994 to 0.995; p < 0.001) and 0.985 (0.982 to 0.988; p < 0.001), and mean difference (SD), -0.44 (2.98) and +0.56 (2.36) points for the number of unique designs and perseverative errors, respectively. We conclude that the agreement between the computerized and human assessment was very high and comparable to the agreement between two independent human assessments. Therefore, the software is an accurate tool for the assessment of performance on the RFFT.
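
For readers who want to see how the two agreement statistics reported above are computed, the Python sketch below calculates an intraclass correlation coefficient and the mean difference (SD) between two sets of scores. The abstract does not state which ICC variant the authors used; ICC(2,1) (two-way random effects, absolute agreement, single rater) is a common choice for this rater-agreement design and is an assumption here. The input data are synthetic, generated only to mimic the reported scale of the unique-designs score; they are not the study's data.

```python
import numpy as np

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): two-way random-effects, absolute-agreement, single-rater ICC.

    scores -- (n_subjects, k_raters) array; one column per rater/method.
    """
    n, k = scores.shape
    grand_mean = scores.mean()
    # ANOVA sums of squares
    ss_rows = k * np.sum((scores.mean(axis=1) - grand_mean) ** 2)  # subjects
    ss_cols = n * np.sum((scores.mean(axis=0) - grand_mean) ** 2)  # raters
    ss_total = np.sum((scores - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols
    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative synthetic data (NOT the study's data): a "human" score per
# protocol and a "computer" score offset to mimic the reported mean
# difference of about -1.42 (SD 2.78) points for unique designs.
rng = np.random.default_rng(0)
human = rng.normal(81, 29, size=1761)            # mean 81, SD 29, n = 1,761
computer = human - 1.42 + rng.normal(0, 2.78, size=1761)

scores = np.column_stack([computer, human])
diff = computer - human
print(f"ICC(2,1):        {icc_2_1(scores):.3f}")
print(f"Mean difference: {diff.mean():+.2f} (SD {diff.std(ddof=1):.2f})")
```

With these synthetic inputs the ICC comes out near the reported 0.994, illustrating why an ICC close to 1 together with a small mean difference supports the paper's conclusion that the computerized and human assessments are interchangeable.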