BMC Medical Research Methodology (Jan 2023)

Grouped data with survey revision

  • Chung-Han Liang,
  • Da-Wei Wang,
  • Mei-Lien Pan

DOI
https://doi.org/10.1186/s12874-023-01834-7
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 20

Abstract

Introduction Surveys are common research tools, and questionnaire revisions are a common occurrence in longitudinal studies. Revisions can, at times, introduce systematic shifts in measures of interest. We model questionnaire revision as a stochastic process with transition matrices. Revision shifts can therefore be reduced by first estimating these transition matrices and then using them when estimating the measures of interest.
Materials and method An ideal survey response model is defined by a mapping from the true value of a participant's response to an interval on a grouped-data scale. A population that completes surveys multiple times is modeled with multiple stochastic processes, including processes for the true values and for the intervals. While multiple factors contribute to changes in survey responses, here we explored a method that can mitigate the effects of questionnaire revision. We proposed the Version Alignment Method (VAM), a data preprocessing tool that separates the revision-related transitions from all transitions by solving an optimization problem and then uses the revision-related transitions to remove the revision effect. To verify VAM, we used simulated data to study the estimation error and a real-life MJ dataset, containing a large number of long-term questionnaire responses with several questionnaire revisions, to study its feasibility.
Result We compared the difference in the annual average between consecutive years. Without adjustment, the difference was 0.593 when the revision occurred, and VAM brought it down to 0.115; for comparison, the difference between consecutive years without a revision ranged from 0.005 to 0.125. Furthermore, our method mapped the responses onto the same set of intervals, which made it possible to compare the relative frequency of items before and after revisions. The average estimation error in the L-infinity norm was 0.0044, which lay within the 95% CI constructed by bootstrap analysis.
Conclusion Questionnaire revisions can introduce differing response biases and information loss, causing inconsistencies in the estimated measures. Conventional methods can only partly remedy this issue. Our proposed method, VAM, estimates the aggregate effect of all revision-related systematic errors and reduces it, thereby reducing inconsistencies in the final estimates of longitudinal studies.
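
The alignment idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' VAM implementation: it assumes a revision-related transition matrix is already available (VAM obtains it by solving an optimization problem that is not shown here), and the function names, interval midpoints, and all numbers are hypothetical. The sketch maps relative frequencies on the old version's intervals onto the new version's intervals, computes a grouped-data mean from interval midpoints, and measures an L-infinity error between two matrices.

```python
import numpy as np


def align_frequencies(old_freq, transition):
    """Map relative frequencies on old-version intervals to new-version intervals.

    transition[i, j] is the assumed probability that a response in old
    interval i falls in new interval j, so every row must sum to 1.
    """
    old_freq = np.asarray(old_freq, dtype=float)
    transition = np.asarray(transition, dtype=float)
    if not np.allclose(transition.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    return old_freq @ transition


def grouped_mean(freq, midpoints):
    """Mean of grouped data, using one representative midpoint per interval."""
    return float(np.dot(freq, midpoints))


def linf_error(estimated, reference):
    """Largest absolute entrywise difference between two arrays."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.max(np.abs(diff)))


# Hypothetical data: the old version has 3 intervals, the new version has 2.
old_freq = np.array([0.2, 0.5, 0.3])
transition = np.array([
    [1.0, 0.0],  # old interval 0 lies entirely inside new interval 0
    [0.5, 0.5],  # old interval 1 straddles both new intervals
    [0.0, 1.0],  # old interval 2 lies entirely inside new interval 1
])
new_midpoints = np.array([1.0, 3.0])  # hypothetical interval midpoints

new_freq = align_frequencies(old_freq, transition)
aligned_mean = grouped_mean(new_freq, new_midpoints)
print(new_freq, aligned_mean)
```

Once both versions are expressed on the same intervals, relative frequencies and annual averages from before and after a revision can be compared directly, which is the comparison reported in the Result section.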

Keywords