Scientific Reports (Oct 2023)
Development and validation of a prediction model for iron status in a large U.S. cohort of women
Abstract
Serum iron levels can be important contributors to health outcomes, but relying on blood-based measures is often infeasible in large epidemiologic studies. Predictive models that use questionnaire-based factors such as diet, supplement use, recency of blood donation, and medical conditions could potentially provide a noninvasive alternative for studying health effects associated with iron status. We hypothesized that a model based on questionnaire data could predict blood-based biomarkers of iron status. Using serum iron (mcg/dL), ferritin (mcg/dL), and transferrin saturation (%) measured in blood collected at study entry from a subsample of the U.S.-wide Sister Study (n = 3171), we developed and validated multivariable linear regression prediction models for each biomarker. Model performance based on these cross-sectional data was weak, with R² below 0.10 for serum iron and transferrin saturation, but better for ferritin, with an R² of 0.13 in premenopausal women and 0.19 in postmenopausal women. When menopausal status was included in the predictive model for the full sample, the R² for ferritin was 0.31. Internal validation indicated some optimism in the apparent model performance, implying worse performance when the model is applied to new samples from the same population. Serum iron status is hard to assess based only on questionnaire data. Reducing measurement error in both the exposure and outcome may improve the prediction model performance, but environmental heterogeneity, temporal variation, and genetic heterogeneity in absorption and storage may contribute substantially to iron status.
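To illustrate what "optimism" means in internal validation, the following is a minimal sketch of one common approach, a bootstrap optimism correction of R² for a multivariable linear regression. The abstract does not state which internal validation procedure was used, so this choice is an assumption. The data are synthetic, and the function names, number of predictors, and effect sizes are all hypothetical rather than taken from the Sister Study.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ intercept + X."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def r_squared(beta, X, y):
    """R^2 of a fitted coefficient vector evaluated on (X, y)."""
    design = np.column_stack([np.ones(len(X)), X])
    resid = y - design @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def optimism_corrected_r2(X, y, n_boot=200, seed=0):
    """Bootstrap optimism correction (assumed method, not confirmed by the source)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Apparent performance: model fitted and evaluated on the same data.
    apparent = r_squared(fit_ols(X, y), X, y)
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample rows with replacement
        beta_b = fit_ols(X[idx], y[idx])           # refit on the bootstrap sample
        r2_boot = r_squared(beta_b, X[idx], y[idx])  # performance on the bootstrap sample
        r2_orig = r_squared(beta_b, X, y)            # same model on the original data
        optimism.append(r2_boot - r2_orig)
    mean_optimism = float(np.mean(optimism))
    return apparent, mean_optimism, apparent - mean_optimism


# Synthetic stand-in for questionnaire predictors and a biomarker outcome.
rng = np.random.default_rng(42)
n, p = 300, 8                                      # hypothetical sample size and predictor count
X = rng.normal(size=(n, p))
y = 0.4 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

apparent, optimism, corrected = optimism_corrected_r2(X, y)
print(f"apparent R^2:  {apparent:.3f}")
print(f"mean optimism: {optimism:.3f}")
print(f"corrected R^2: {corrected:.3f}")
```

A positive mean optimism means the apparent R² overstates how well the model would likely perform in new samples from the same population, which is the pattern the abstract describes.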