PLoS ONE (Jan 2014)

Community-wide health risk assessment using geographically resolved demographic data: a synthetic population approach.

  • Jonathan I Levy,
  • Maria Patricia Fabian,
  • Junenette L Peters

DOI
https://doi.org/10.1371/journal.pone.0087144
Journal volume & issue
Vol. 9, no. 1
p. e87144

Abstract

Read online

BACKGROUND: Evaluating environmental health risks in communities requires models characterizing geographic and demographic patterns of exposure to multiple stressors. These exposure models can be constructed from multivariable regression analyses using individual-level predictors (microdata), but these microdata are not typically available with sufficient geographic resolution for community risk analyses given privacy concerns. METHODS: We developed synthetic geographically-resolved microdata for a low-income community (New Bedford, Massachusetts) facing multiple environmental stressors. We first applied probabilistic reweighting using simulated annealing to data from the 2006-2010 American Community Survey, combining 9,135 microdata samples from the New Bedford area with census tract-level constraints for individual and household characteristics. We then evaluated the synthetic microdata using goodness-of-fit tests and by examining spatial patterns of microdata fields not used as constraints. As a demonstration, we developed a multivariable regression model predicting smoking behavior as a function of individual-level microdata fields using New Bedford-specific data from the 2006-2010 Behavioral Risk Factor Surveillance System, linking this model with the synthetic microdata to predict demographic and geographic smoking patterns in New Bedford. RESULTS: Our simulation produced microdata representing all 94,944 individuals living in New Bedford in 2006-2010. Variables in the synthetic population matched the constraints well at the census tract level (e.g., ancestry, gender, age, education, household income) and reproduced the census-derived spatial patterns of non-constraint microdata. Smoking in New Bedford was significantly associated with numerous demographic variables found in the microdata, with estimated tract-level smoking rates varying from 20% (95% CI: 17%, 22%) to 37% (95% CI: 30%, 45%). CONCLUSIONS: We used simulation methods to create geographically-resolved individual-level microdata that can be used in community-wide exposure and risk assessment studies. This approach provides insights regarding community-scale exposure and vulnerability patterns, valuable in settings where policy can be informed by characterization of multi-stressor exposures and health risks at high resolution.