International Journal of Health Geographics (Nov 2005)

Geographic bias related to geocoding in epidemiologic studies

  • Siadaty Mir,
  • Matthews Kevin A,
  • Oliver M Norman,
  • Hauck Fern R,
  • Pickle Linda W

DOI
https://doi.org/10.1186/1476-072X-4-29
Journal volume & issue
Vol. 4, no. 1
p. 29

Abstract

Background: This article describes geographic bias in GIS analyses with unrepresentative data owing to missing geocodes, using as an example a spatial analysis of prostate cancer incidence among whites and African Americans in Virginia, 1990–1999. Statistical tests for clustering were performed and the resulting clusters were mapped. The patterns of missing census tract identifiers for the cases were examined with generalized linear regression models.

Results: The county of residency for all cases was known, and 26,338 (74%) of these cases were geocoded successfully to census tracts. Cluster maps showed markedly different patterns depending on whether all cases were used or only those geocoded to the census tract. Multivariate regression analysis showed that, in the most rural counties (where the missing data were concentrated), the percentage of a county's population over age 64 and the percentage with less than a high school education were both independently associated with a higher percentage of missing geocodes.

Conclusion: We found statistically significant pattern differences resulting from spatially non-random differences in geocoding completeness across Virginia. Appropriate interpretation of maps, therefore, requires an understanding of this phenomenon, which we call "cartographic confounding."
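The quantity at the center of the analysis, the percentage of cases in each county that lack a census tract geocode, can be illustrated with a minimal sketch. This is not the authors' code: the function name `percent_missing_by_county`, the record layout, and the toy records are all hypothetical and chosen only to show the calculation.

```python
from collections import defaultdict


def percent_missing_by_county(cases):
    """Return {county: percent of its cases with no census tract geocode}.

    `cases` is an iterable of (county, tract) pairs, where `tract` is None
    when geocoding to the census tract failed. This layout is an assumption
    made for illustration only.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for county, tract in cases:
        totals[county] += 1
        if tract is None:
            missing[county] += 1
    return {county: 100.0 * missing[county] / totals[county] for county in totals}


# Hypothetical toy records, not data from the study.
toy_cases = [
    ("County A", "0001"),
    ("County A", "0002"),
    ("County A", "0001"),
    ("County A", None),
    ("County B", None),
    ("County B", "0101"),
]

rates = percent_missing_by_county(toy_cases)
for county, pct in sorted(rates.items()):
    print(f"{county}: {pct:.1f}% of cases missing a tract geocode")
```

When these county-level percentages differ systematically across a study area, a map built only from tract-geocoded cases under-represents the counties with lower completeness, which is the situation the abstract describes.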

Keywords