International Journal of Data and Network Science (Jan 2024)

Sensitivity analysis of the PC hyperprior for range and standard deviation components in Bayesian Spatiotemporal high-resolution prediction: An application to PM2.5 prediction in Jakarta, Indonesia

  • Tafia Hasna Putri,
  • I Gede Nyoman Mindra Jaya,
  • Toni Toharudin,
  • Farah Kristiani

DOI
https://doi.org/10.5267/j.ijdns.2023.12.018
Journal volume & issue
Vol. 8, no. 2
pp. 871 – 880

Abstract

The Gaussian Markov Random Field (GMRF) is widely regarded as more flexible than conventional Kriging methods, especially for high-resolution prediction. Because the approach rests on Bayesian estimation, the prior and hyperprior distributions and their parameter values require careful examination. Sensitivity analyses are crucial for evaluating how these distributions and parameter values affect prediction results. Hyperprior parameter values are commonly chosen through iterative trial and error. In our research, we systematically assessed a variety of parameter values through exhaustive cross-validation. Our study focuses on optimizing the parameter values of the Penalized Complexity (PC) hyperprior. We applied our method to spatiotemporal high-resolution prediction of PM2.5 concentrations in Jakarta province, Indonesia, where accurate predictive modeling depends on this optimization. We found that the most accurate predictions are obtained with PC hyperprior parameters specifying a range of less than 2,000 and a standard deviation greater than 1 with a 0.1 probability. These parameter values yield the minimum mean absolute percentage error (MAPE) of 19.35393, along with a deviance information criterion (DIC) of -154.23. Our findings show that the standard deviation parameter significantly influences model fit, whereas the range parameter has relatively little impact. When coupled with high-resolution mapping, these optimized parameters give a comprehensive view of distribution patterns and help detect areas particularly susceptible to risk, thereby supporting decision-making on air quality management.
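
As a rough illustration of the quantities named in the abstract, the sketch below converts a PC prior statement of the form P(range < r0) = p and P(sd > s0) = p into the rate parameters of the standard PC prior for a Matérn-type field, and computes MAPE. It is not the authors' code. The function names, the two-dimensional setting, and the reading that the 0.1 probability applies to both the range and the standard deviation are assumptions made here for illustration.

```python
import math

def pc_prior_rates(range0, p_range, sigma0, p_sigma, dim=2):
    """Rate parameters of a PC prior on (range, sd), specified through
    P(range < range0) = p_range and P(sd > sigma0) = p_sigma.
    Assumption: the usual PC prior form for a Matern field in `dim` dimensions."""
    lam_range = -math.log(p_range) * range0 ** (dim / 2)
    lam_sigma = -math.log(p_sigma) / sigma0
    return lam_range, lam_sigma

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    terms = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(terms) / len(terms)

# Hypothetical reading of the reported setting: P(range < 2000) = 0.1, P(sd > 1) = 0.1.
lam_range, lam_sigma = pc_prior_rates(2000, 0.1, 1, 0.1)
```

In a sensitivity analysis of the kind described, such a specification would be varied over a grid of candidate values, with the model refitted and MAPE recomputed under cross-validation for each candidate.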