Journal of Medical Internet Research (Aug 2023)

Effects of Excluding Those Who Report Having “Syndomitis” or “Chekalism” on Data Quality: Longitudinal Health Survey of a Sample From Amazon’s Mechanical Turk

  • Ron D Hays,
  • Nabeel Qureshi,
  • Patricia M Herman,
  • Anthony Rodriguez,
  • Arie Kapteyn,
  • Maria Orlando Edelen

DOI
https://doi.org/10.2196/46421
Journal volume & issue
Vol. 25
p. e46421

Abstract

Background
Researchers have implemented multiple approaches to increase data quality from existing web-based panels such as Amazon’s Mechanical Turk (MTurk).

Objective
This study extends prior work by examining improvements in data quality and effects on mean estimates of health status by excluding respondents who endorse 1 or both of 2 fake health conditions (“Syndomitis” and “Chekalism”).

Methods
Survey data were collected in 2021 at baseline and 3 months later from MTurk study participants who were aged 18 years or older, had an internet protocol address in the United States, and had completed a minimum of 500 previous MTurk “human intelligence tasks.” We included questions about demographic characteristics, health conditions (including the 2 fake conditions), and the Patient Reported Outcomes Measurement Information System (PROMIS)-29+2 (version 2.1) preference–based score survey. The 3-month follow-up survey was only administered to those who reported having back pain and did not endorse a fake condition at baseline.

Results
In total, 15% (996/6832) of the sample endorsed at least 1 of the 2 fake conditions at baseline. Those who endorsed a fake condition at baseline were more likely to identify as male and non-White, to be younger, to report more health conditions, and to take longer to complete the survey than those who did not endorse a fake condition. They also had substantially lower internal consistency reliability on the PROMIS-29+2 scales than those who did not endorse a fake condition: physical function (0.69 vs 0.89), pain interference (0.80 vs 0.94), fatigue (0.80 vs 0.92), depression (0.78 vs 0.92), anxiety (0.78 vs 0.90), sleep disturbance (−0.27 vs 0.84), ability to participate in social roles and activities (0.77 vs 0.92), and cognitive function (0.65 vs 0.77). The lack of reliability of the sleep disturbance scale among those endorsing a fake condition arose because the scale includes both positively and negatively worded items. Those who endorsed a fake condition reported significantly worse health scores (except for sleep disturbance) than those who did not endorse a fake condition. Excluding those who endorsed a fake condition improved the overall mean PROMIS-29+2 (version 2.1) T-scores by 1-2 points and the PROMIS preference–based score by 0.04. Among respondents to the 3-month survey, none of whom had endorsed a fake condition at baseline, 6% (n=59) endorsed at least 1 fake condition at follow-up; they had lower PROMIS-29+2 score internal consistency reliability and worse mean scores on the 3-month survey than those who did not report having a fake condition. Based on these results, we estimate that 25% (1708/6832) of the MTurk respondents provided careless or dishonest responses.

Conclusions
This study provides evidence that asking about fake health conditions can help to screen out respondents who may be dishonest or careless. We recommend that this approach be used routinely in MTurk samples.
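
The negative reliability reported for the sleep disturbance scale can be illustrated with a small sketch. It assumes that "internal consistency reliability" means Cronbach's alpha, which the abstract does not state, and it uses made-up toy responses on a hypothetical 4-item, 1-5 scale rather than any study data. The function names `cronbach_alpha` and `reverse_code` are illustrative only. Respondents who give nearly the same raw answer to every item look inconsistent once the oppositely worded items are reverse-coded, which can push alpha below zero.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def reverse_code(rows, reversed_items, low=1, high=5):
    """Flip the listed item indices so all items point in the same direction."""
    return [
        [(low + high - x) if j in reversed_items else x for j, x in enumerate(r)]
        for r in rows
    ]


# Toy raw answers (assumption: items 0-1 are worded toward disturbance,
# items 2-3 toward good sleep, so items 2-3 are reverse-coded for scoring).
attentive_raw = [
    [1, 1, 5, 5], [2, 2, 4, 3], [3, 3, 3, 3],
    [4, 4, 2, 2], [5, 5, 1, 2], [4, 5, 2, 1],
]
# Near-identical raw answers across all items, ignoring item wording.
careless_raw = [
    [1, 1, 1, 1], [5, 5, 5, 5], [2, 2, 2, 3],
    [4, 4, 4, 3], [3, 3, 2, 3], [3, 3, 4, 3],
]

alpha_attentive = cronbach_alpha(reverse_code(attentive_raw, {2, 3}))
alpha_careless = cronbach_alpha(reverse_code(careless_raw, {2, 3}))

print(f"attentive alpha: {alpha_attentive:.2f}")
print(f"careless alpha:  {alpha_careless:.2f}")
```

In this toy example the attentive group yields a high positive alpha and the careless group a negative one. That is the same direction of effect the abstract describes, though the magnitudes here carry no meaning.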