Royal Society Open Science (Jun 2024)

Bayesian cluster geographically weighted regression for spatial heterogeneous data

  • Wala Draidi Areed,
  • Aiden Price,
  • Helen Thompson,
  • Conor Hassan,
  • Reid Malseed,
  • Kerrie Mengersen

DOI
https://doi.org/10.1098/rsos.231780
Journal volume & issue
Vol. 11, no. 6

Abstract

Spatial statistical models are commonly used in geographical scenarios to ensure spatial variation is captured effectively. However, spatial models and cluster algorithms can be complicated and computationally expensive. One of these algorithms is geographically weighted regression (GWR), which was proposed in the geography literature to allow relationships in a regression model to vary over space. In contrast to traditional linear regression models, which have constant regression coefficients over space, GWR estimates regression coefficients locally at spatially referenced data points. The motivation for the adoption of GWR is the idea that a set of constant regression coefficients cannot adequately capture spatially varying relationships between covariates and an outcome variable. GWR has been applied widely in diverse fields, such as ecology, forestry, epidemiology, neurology and astronomy. While frequentist GWR gives point estimates and confidence intervals, Bayesian GWR enriches our understanding by including prior knowledge and providing probability distributions for parameters and predictions of interest. This paper pursues three main objectives. First, it introduces covariate effect clustering by integrating Bayesian geographically weighted regression (BGWR) with a post-processing step that includes a Gaussian mixture model and a Dirichlet process mixture model. Second, working within the Bayesian framework, it examines situations in which a particular covariate is important in one region but not in another. Lastly, it addresses computational challenges in existing BGWR methods, leading to enhancements in Markov chain Monte Carlo estimation suitable for large spatial datasets. The efficacy of the proposed method is demonstrated using simulated data and is further validated in a case study examining children’s development domains in Queensland, Australia, using data provided by Children’s Health Queensland and Australia’s Early Development Census.
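
The local estimation idea behind GWR can be illustrated with a minimal sketch. This is a plain frequentist weighted least squares fit with a single covariate, not the Bayesian method or the clustering step proposed in the paper. The Gaussian kernel, the bandwidth value, the toy data and all function names are assumptions chosen for illustration only.

```python
import math


def gaussian_weight(dist, bandwidth):
    """Gaussian kernel (an assumed choice): nearby sites get weights near 1."""
    return math.exp(-0.5 * (dist / bandwidth) ** 2)


def local_fit(target, coords, x, y, bandwidth):
    """Weighted least squares for y = b0 + b1 * x, centred on `target`."""
    w = [gaussian_weight(math.dist(target, c), bandwidth) for c in coords]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # Closed-form solution of the 2x2 weighted normal equations.
    det = sw * swxx - swx ** 2
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def gwr(coords, x, y, bandwidth):
    """One (intercept, slope) pair per site instead of one global pair."""
    return [local_fit(c, coords, x, y, bandwidth) for c in coords]


# Hypothetical toy data: 20 sites on a line. The covariate effect is 1.0
# in the western half and 3.0 in the eastern half, with no noise.
coords = [(float(i), 0.0) for i in range(20)]
x = [float((i * 7) % 5 + 1) for i in range(20)]
true_slope = [1.0 if i < 10 else 3.0 for i in range(20)]
y = [s * xi for s, xi in zip(true_slope, x)]

coefs = gwr(coords, x, y, bandwidth=1.5)
west_slope = coefs[0][1]   # local slope at the westernmost site
east_slope = coefs[-1][1]  # local slope at the easternmost site
```

A single global regression would report one compromise slope for all 20 sites, whereas the local fits recover different slopes at the two ends of the line. Local coefficients of this kind are what a clustering step could then group into regions with similar covariate effects.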

Keywords