Wellcome Open Research (Sep 2020)

Approaches to minimising the epidemiological impact of sources of systematic and random variation that may affect biochemistry assay data in UK Biobank [version 1; peer review: 2 approved]

  • Naomi E. Allen
  • Matthew Arnold
  • Sarah Parish
  • Michael Hill
  • Simon Sheard
  • Howard Callen
  • Daniel Fry
  • Stewart Moffat
  • Mark Gordon
  • Samantha Welsh
  • Paul Elliott
  • Rory Collins

DOI
https://doi.org/10.12688/wellcomeopenres.16171.1
Journal volume & issue
Vol. 5

Abstract

Background: UK Biobank is a large prospective study that recruited 500,000 participants aged 40 to 69 years between 2006 and 2010. The study has collected (and continues to collect) extensive phenotypic and genomic data about its participants. To further enhance the value of the UK Biobank resource, a wide range of biochemistry markers were measured in all participants with an available biological sample. Here, we describe the approaches UK Biobank has taken to minimise error related to sample collection, processing, retrieval and assay measurement.
Methods: During routine quality control checks, the laboratory team observed that some assay results were lower than expected for samples acquired during certain time periods. Analyses were undertaken to identify and correct for the unexpected dilution identified during sample processing, and for expected error caused by laboratory drift of assay results.
Results: The vast majority (92%) of biochemistry serum assay results were assessed as not materially affected by dilution, with an estimated concentration differing by less than 1% (either lower or higher) from that expected if the sample were unaffected; 8.3% were estimated to be diluted by up to 10%; very few samples appeared to be diluted by more than this. Biomarkers measured in urine (creatinine, microalbumin, sodium, potassium) and red blood cells (HbA1c) were not affected. To correct for laboratory variation over the assay period, all assay results were adjusted for date of assay, with the exception of those that had a high biological coefficient of variation or evident seasonal variability: vitamin D, lipoprotein (a), gamma glutamyltransferase, C-reactive protein and rheumatoid factor.
Conclusions: Rigorous approaches related to sample collection, processing, retrieval, assay measurement and data analysis have been taken to mitigate the impact of both systematic and random variation in epidemiological analyses that use the biochemistry assay data in UK Biobank.