EClinicalMedicine (Apr 2022)

Development and external validation of a clinical prediction model to aid coeliac disease diagnosis in primary care: An observational study

  • Martha M.C. Elwenspoek, PhD,
  • Rachel O'Donnell, MSc,
  • Joni Jackson, MSc,
  • Hazel Everitt, PhD,
  • Peter Gillett, MBChB,
  • Alastair D. Hay, FRCGP,
  • Hayley E. Jones, PhD,
  • Gerry Robins, MD,
  • Jessica C. Watson, PhD,
  • Sue Mallett, DPhil,
  • Penny Whiting, PhD

Journal volume & issue
Vol. 46
p. 101376

Abstract

Background: Coeliac disease (CD) affects approximately 1% of the population, although only a fraction of patients are diagnosed. Our objective was to develop diagnostic prediction models to help decide who should be offered testing for CD in primary care.

Methods: Logistic regression models were developed in Clinical Practice Research Datalink (CPRD) GOLD (between Sep 9, 1987 and Apr 4, 2021, n=107,075) and externally validated in CPRD Aurum (between Jan 1, 1995 and Jan 15, 2021, n=227,915), two UK primary care databases, using 1:4 nested case-control designs and accounting for this sampling in the analysis. Candidate predictors included symptoms and chronic conditions identified from current guidelines and a systematic review of the literature. We used elastic-net regression to further refine the models.

Findings: The prediction models included 24, 24, and 21 predictors for children, women, and men, respectively. For children, the strongest predictors were type 1 diabetes, Turner syndrome, IgA deficiency, and having a first-degree relative with CD. For women and men, the strongest predictors were anaemia and having a first-degree relative with CD. In the development dataset, the models showed good discrimination, with a c-statistic of 0·84 (95% CI 0·83–0·84) in children, 0·77 (0·77–0·78) in women, and 0·81 (0·81–0·82) in men. Discrimination was lower in external validation, potentially because ‘first-degree relative’ was not recorded in the validation dataset. Model calibration was poor: the models tended to overestimate CD risk in all three groups in both datasets.

Interpretation: These prediction models could help identify individuals at increased risk of CD in relatively low-prevalence populations such as primary care. Offering a serological test to these patients could increase case finding for CD. However, this would mean offering tests to more people than are currently tested. Further work is needed in prospective cohorts to refine and confirm the models and to assess clinical and cost effectiveness.
Funding: National Institute for Health Research Health Technology Assessment Programme (grant number NIHR129020)
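The two performance measures reported in the Findings, discrimination (the c-statistic) and calibration, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `c_statistic` and `observed_to_expected` and the toy risks and outcomes are hypothetical, and the observed-to-expected ratio is only one simple way to summarise calibration, which the abstract does not specify.

```python
def c_statistic(risks, outcomes):
    """Probability that a randomly chosen case has a higher predicted
    risk than a randomly chosen non-case (ties count as one half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                score += 1.0
            elif case_risk == control_risk:
                score += 0.5
    return score / (len(cases) * len(controls))


def observed_to_expected(risks, outcomes):
    """Observed number of cases divided by the sum of predicted risks.
    A value below 1 means the model overestimates risk on average."""
    return sum(outcomes) / sum(risks)


# Hypothetical toy data: predicted risks and true CD status (1 = case).
toy_risks = [0.9, 0.4, 0.6, 0.2]
toy_outcomes = [1, 1, 0, 0]

c_value = c_statistic(toy_risks, toy_outcomes)
oe_ratio = observed_to_expected(toy_risks, toy_outcomes)
print(f"c-statistic: {c_value:.2f}, observed/expected: {oe_ratio:.2f}")
```

In this toy example three of the four case/non-case pairs are ranked correctly, and the predicted risks sum to slightly more than the two observed cases, which is the direction of miscalibration (overestimation) described in the Findings.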

Keywords