BMC Medical Research Methodology (Jun 2020)
Missing-data analysis: socio-demographic, clinical and lifestyle determinants of low response rate on self-reported psychological and nutrition related multi-item instruments in the context of the ATTICA epidemiological study
Abstract
Background: Missing data are a common problem in epidemiological studies, and the problem becomes more critical when the missing data concern a multi-item instrument, because a missing response to even one item makes it impossible to calculate the instrument’s total score. The aim was to investigate the socio-demographic, lifestyle and clinical determinants of low response rate in two self-rating multi-item scales assessing individuals’ nutritional habits and psychological disorders, and to compare different missing-data handling techniques for imputing missing values in this context.
Methods: The sample from the ATTICA epidemiological study with complete baseline information (2001–2002) on demographic characteristics was used [n = 2194 subjects (1364 men: 64 years old (SD = 12 years) and 830 women: 66 years old (SD = 12 years))]. Adherence to the Mediterranean diet and depressive symptomatology were assessed at baseline with the MedDietScore scale and Zung’s Self-rating Depression Scale (SDS), respectively. Logistic and Poisson regression analyses were used to explore the determinants of low response in each scale. Seven missing-data handling techniques were compared in terms of the estimated regression coefficients and their standard errors, under different scenarios of missingness, in the context of a multivariable logistic regression model examining the association of each scale with the participants’ likelihood of being hypertensive.
Results: Older age, lower educational level, poorer health status and unhealthy lifestyle habits were significant determinants of high nonresponse rates in both the MedDietScore scale and Zung’s SDS. Female participants were more likely to have missing data in the items of the MedDietScore scale, whereas male participants had a significantly higher number of missing items in the depression scale. Regarding the analysis of such data, multiple imputation was found to be the most effective technique, even when the number of missing items was large.
Conclusions: The present work augments prior evidence that non-response to health surveys is significantly affected by responders’ background characteristics, and it motivates research into the as yet unexplored pathways behind this finding, especially in the era of nutritional epidemiology.
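The abstract's central point, that one unanswered item makes the total score of a multi-item instrument incalculable, can be illustrated with a minimal Python sketch. The participant identifiers, the four-item scale and all response values below are invented for illustration and are not ATTICA data; the helper names `total_score` and `n_missing` are hypothetical. The per-participant count of missing items and the binary "any item missing" flag are shown only as plausible examples of outcomes suited to Poisson and logistic regression, respectively; the abstract does not specify how the outcomes were actually coded.

```python
from typing import Dict, List, Optional, Sequence

Item = Optional[int]  # None marks an unanswered item


def total_score(items: Sequence[Item]) -> Optional[int]:
    """Sum the item responses; return None if any single item is missing."""
    if any(value is None for value in items):
        return None
    return sum(value for value in items if value is not None)


def n_missing(items: Sequence[Item]) -> int:
    """Count the unanswered items for one participant."""
    return sum(1 for value in items if value is None)


# Hypothetical responses to an invented four-item scale (not study data).
responses: Dict[str, List[Item]] = {
    "p1": [3, 4, 2, 5],
    "p2": [3, None, 2, 5],
    "p3": [None, None, 1, 0],
}

# Total score per participant: lost as soon as one item is unanswered.
totals = {pid: total_score(items) for pid, items in responses.items()}

# Count outcome (assumed here as a candidate for a Poisson model).
missing_counts = {pid: n_missing(items) for pid, items in responses.items()}

# Binary outcome (assumed here as a candidate for a logistic model).
any_missing = {pid: count > 0 for pid, count in missing_counts.items()}

for pid in responses:
    print(pid, totals[pid], missing_counts[pid], any_missing[pid])
```

In this toy data set, two of three participants lose their total score although most of their items were answered, which is the loss of information that imputation at the item level aims to recover.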
Keywords