An exploratory analysis of missing data from the Royal Bank of Canada (RBC) Learn to Play – Canadian Assessment of Physical Literacy (CAPL) project

BMC Public Health. 2018;18(S2):1-9. DOI: 10.1186/s12889-018-5901-z

 

Journal Title: BMC Public Health

ISSN: 1471-2458 (Online)

Publisher: BMC

LCC Subject Category: Medicine: Public aspects of medicine

Country of publisher: United Kingdom

Language of fulltext: English

Full-text formats available: PDF, HTML

 

AUTHORS

Christine Delisle Nyström (Healthy Active Living and Obesity (HALO) Research Group, Children’s Hospital of Eastern Ontario Research Institute)
Joel D. Barnes (Healthy Active Living and Obesity (HALO) Research Group, Children’s Hospital of Eastern Ontario Research Institute)
Mark S. Tremblay (Healthy Active Living and Obesity (HALO) Research Group, Children’s Hospital of Eastern Ontario Research Institute)

EDITORIAL INFORMATION

Open peer review

Time From Submission to Publication: 18 weeks

 

ABSTRACT

Background: The assessment of physical literacy comprises a range of tests over four domains (Physical Competence, Daily Behaviour, Motivation and Confidence, and Knowledge and Understanding). The patterns of missing data in large field test batteries such as those for physical literacy are largely unknown. Therefore, the aim of this paper was to explore the patterns of, and possible reasons for, missing data in the Royal Bank of Canada Learn to Play–Canadian Assessment of Physical Literacy (RBC Learn to Play–CAPL) project.

Methods: A total of 10,034 Canadian children aged 8 to 12 years participated in the RBC Learn to Play–CAPL project. A 32-variable subset from the larger CAPL dataset was used for these analyses. Several R packages (“Hmisc”, “mice”, “VIM”) were used to generate matrices and plots of missing data, and to perform multiple imputations.

Results: Overall, the proportion of missing data for individual measures and domains ranged from 0.0 to 33.8%, with the average proportion of missing data being 4.0%. The largest proportion of missing data in CAPL was for the pedometer step counts, followed by the components of the Physical Competence domain and the Children’s Self-Perception of Adequacy in and Predilection for Physical Activity subscales. When domain scores were regressed on five imputed subsets with the original subset as the reference, there were small and statistically detectable differences in the Daily Behaviour score (β = −1.6 to −1.7, p < 0.001). However, for the other domain scores the differences were negligible and statistically undetectable (β = −0.01 to −0.06, p > 0.05).

Conclusions: This study has implications for other researchers or educators who are creating or using large field-based assessment measures in the areas of physical literacy, physical activity, or physical fitness, as it demonstrates where problems in data collection can arise and how missing data can be avoided. When large proportions of missing data are present, imputation techniques, correction factors, or other treatment options may be required.
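
The Methods describe generating matrices of missing data. As a rough illustration of what that involves, the sketch below computes the proportion of missing values per variable and counts the distinct observed/missing patterns across records. It is not the authors' R code: it is a Python analogue using only the standard library, and the variable names (`step_count`, `plank_s`, `knowledge`) and all values are hypothetical toy data, not CAPL measurements.

```python
from collections import Counter

# Toy records. Variable names and values are hypothetical, not CAPL data.
# None marks a missing measurement.
records = [
    {"step_count": 11250, "plank_s": 62.0, "knowledge": 14},
    {"step_count": None,  "plank_s": 48.5, "knowledge": 12},
    {"step_count": None,  "plank_s": None, "knowledge": 15},
    {"step_count": 9800,  "plank_s": 71.0, "knowledge": None},
    {"step_count": 10400, "plank_s": 55.0, "knowledge": 13},
]


def missing_proportions(rows):
    """Return the fraction of missing values for each variable."""
    variables = list(rows[0].keys())
    return {v: sum(r[v] is None for r in rows) / len(rows) for v in variables}


def missing_patterns(rows):
    """Count rows sharing the same pattern (1 = observed, 0 = missing)."""
    variables = list(rows[0].keys())
    return Counter(
        tuple(int(r[v] is not None) for v in variables) for r in rows
    )


proportions = missing_proportions(records)
patterns = missing_patterns(records)

for name, p in proportions.items():
    print(f"{name}: {p:.1%} missing")

# Each pattern is ordered as (step_count, plank_s, knowledge).
for pattern, count in patterns.most_common():
    print(pattern, count)
```

Tabulating patterns rather than only per-variable proportions shows whether values tend to be missing together (for example, a child absent on the day several tests were run), which is the kind of question a missing-data matrix is meant to answer.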