Abstract
This paper introduces a new geostatistical model for count data under a space-time approach using nonhomogeneous Poisson processes, in which the random intensity process has an additive formulation with two components: a Gaussian spatial component and a component accounting for the temporal effect. Inferences of interest for the proposed model are obtained under the Bayesian paradigm. To illustrate the usefulness of the proposed model, we first carry out a simulation study to test the efficacy of the Markov chain Monte Carlo (MCMC) method in generating samples from the joint posterior distribution of the model’s parameters. This study shows that the MCMC algorithm converges readily in the different scenarios considered. As a second illustration, the proposed model is applied to a real data set on ozone air pollution collected at 22 monitoring stations in Mexico City in 2010. The proposed geostatistical model performs well in the data analysis, both in terms of fit to the data and in identifying the regions with the highest pollution levels, namely the southwest, central and northwest regions of Mexico City.
References
Achcar JA, Dey KD, Niverthi M (1998) A Bayesian approach using nonhomogeneous Poisson process for software reliability models. In: Basu AS, Basu SK, Mukhopadhyay S (eds) Frontiers in reliability, vol 4, 1st edn. World Scientific, River Edge, pp 1–18
Achcar JA, Fernández-Bremauntz AA, Rodrigues ER, Tzintzun G (2008) Estimating the number of ozone peaks in Mexico City using a non-homogeneous Poisson model. Environmetrics 19(5):469–485
Achcar JA, Rodrigues ER, Tzintzun G (2011a) Modelling inter-occurrence times between ozone peaks in Mexico City in the presence of multiple change-points. Braz J Probab Stat 25(2):183–204
Achcar JA, Rodrigues ER, Tzintzun G (2011b) Using non-homogeneous Poisson models with multiple change-points to estimate the number of ozone exceedances in Mexico City. Environmetrics 22:1–12
Air Resource Board (ARB) (2005) Review of air quality standard for ozone in California. Environmental Protection Agency, Staff Report, California, USA
Alonso JB, Achcar JA, Hotta LK (2010) Climate changes and their effects in public health: use of Poisson regression models. Pesqui Oper 30:427–442
Bartoletti S, Loperfido N (2010) Modelling air pollution data by the skew-normal distribution. Stoch Environ Res Risk Assess 24(4):513–517
Braga ALF, Zanobetti A, Schwartz J (2002) The effect of weather on respiratory and cardiovascular deaths in 12 U.S. Cities. Environ Health Perspect 110:859–863
Chien LC, Bangdiwala SI (2012) The implementation of Bayesian structural additive regression models in multi-city time series air pollution and human health studies. Stoch Environ Res Risk Assess 26(8):1041–1051
Cox DR (1955) Some statistical methods connected with series of events. J R Stat Soc Ser B 17:129–164
Cressie NAC (1993) Statistics for spatial data, Revised edn. Wiley, New York
Daley DJ, Vere-Jones D (1988) An introduction to the theory of point processes: volume I: elementary theory and methods, vol 1, 2nd edn. Springer, New York
Diggle PJ, Moraga P, Rowlingson B, Taylor BM (2013) Spatial and spatio-temporal log-Gaussian Cox processes: extending the geostatistical paradigm. Stat Sci 28(4):542–563
Fishman PM, Snyder DL (1976) The statistical analysis of space-time point processes. IEEE Inf Theory Soc 22(3):257–274
Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:434–455
Goel AL (1983) A guidebook for software reliability assessment. Technical Report
Gouveia N, Freitas CU, Martins LC, Marcilio IO (2006) Hospitalizações por causas respiratórias associadas à contaminação atmosférica no município de São Paulo, Brasil. Cadernos Saúde Pública 22:2669–2677
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109
He C, Huang Z, Ye X (2014) Spatial heterogeneity of economic development and industrial pollution in urban China. Stoch Environ Res Risk Assess 28(4):767–781
Karr A (1991) Point processes and their statistical inference, 2nd edn. Dekker, New York
Kuo L, Yang T (1996) Bayesian computation for nonhomogeneous Poisson processes in software reliability. J Am Stat Assoc 91:763–773
Lawson AB (2008) Bayesian disease mapping: hierarchical modeling in spatial epidemiology. Chapman and Hall, London, UK
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equations of state calculations by fast computing machines. J Chem Phys 21:1087–1092
Muñoz E, Martin ML, Turias IJ, Jimenez-Come MJ, Trujillo FJ (2014) Prediction of PM10 and SO2 exceedances to control air pollution in the Bay of Algeciras, Spain. Stoch Environ Res Risk Assess 28(6):1409–1420
Nadarajah S (2008) A truncated inverted beta distribution with application to air pollution data. Stoch Environ Res Risk Assess 22(2):285–289
Ribeiro MC, Pinho P, Lop E, Branquinho C, Soares A, Pereira MJ (2014) Associations between outdoor air quality and birth weight: a geostatistical sequential simulation approach in Coastal Alentejo, Portugal. Stoch Environ Res Risk Assess 28(3):527–540
Rodrigues ER, Achcar JA (2012) Applications of discrete-time Markov chains and Poisson processes to air pollution modeling and studies, vol 1. Springer, New York, p 107
Rodrigues ER, Gamerman D, Tarumoto MH, Tarumoto G (2015) A non-homogeneous Poisson model with spatial anisotropy applied to ozone data from Mexico City. Environ Ecol Stat 22(2):393–422
Schmidt AM, Gelfand A (2003) A Bayesian coregionalization approach for multivariate pollutant data. J Geophys Res 108(D24). http://www.agu.org/pubs/crossref/2003/2002JD002905.shtml
Schmidt AM, Conceição FG, Moreira GA (2008) Investigating the sensitivity of Gaussian processes to the choice of their correlation function and prior specifications. J Stat Comput Simul 78(8):681–699
Schoenberg F (1999) Transforming spatial point processes into Poisson processes. Stoch Process Appl 81:155–164
Snyder DL, Miller MI (1991) Random point processes in time and space. Wiley, New York
Vere-Jones D, Thomson PJ (1984) Some aspects of space-time modelling. In: Proceedings of twelfth international biometrics conference, Tokyo, pp 265–275
Vicini L, Hotta LK, Achcar JA (2013) Non-homogeneous Poisson process in the presence of one or more change-points: an application to air pollution data. J Environ Stat 5:1–22
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Yu B, Huang C, Liu Z, Wang H, Wang L (2011) A chaotic analysis on air pollution index change over past 10 years in Lanzhou, northwest China. Stoch Environ Res Risk Assess 25(5):643–653
Acknowledgments
The authors are very grateful to the Editor and referees for their helpful and useful comments that improved the manuscript. The third author acknowledges support by São Paulo Research Foundation (FAPESP), Grant 2009/15098-0. The first and fourth authors were partially supported by CNPq-Brazil. The second and third authors acknowledge Laboratório Epifisma, Unicamp.
Appendix: Prior and posterior distributions and interpolation
1.1 Prior distributions for the parameters \(\beta\), \(\alpha\), \(\mathbf \Psi ,\sigma ^2\) and \(\phi\)
The Bayesian approach for the model defined by Eqs. (1) to (7) must be complemented by the specification of prior distributions for the parameters. As in Achcar et al. (1998), we introduce a latent variable \(N_j'=N_j-n_j\). Such a latent variable was proposed by Kuo and Yang (1996) to facilitate sampling from the joint posterior distribution. Similarly to the original paper, as shown in Sect. 1, the posterior distribution of \(( \mathbf {N'},\Phi \mid D)\) is given directly, and the distribution of \((\Phi \mid D)\) is obtained as the corresponding marginal distribution. We assume the following prior distributions for \(N_j',\beta ,\alpha ,\) \(\mathbf \Psi ,\sigma ^2\) and \(\phi\):
- \(N'_j\sim Pois\left[ e^{W_j -\beta T_j^{\alpha }}\right]\),
- \(\beta \sim G(a_{\beta },b_{\beta }),\) in which \(a_{\beta }\), \(b_{\beta }\) are known,
- \(\alpha \sim G(a_{\alpha },b_{\alpha }),\) in which \(a_{\alpha }\), \(b_{\alpha }\) are known,
- \({\varvec {\Psi}} \sim N({\mathbf {m}},{\mathbf {v}}),\) in which \(\mathbf {m}\), \(\mathbf {v}\) are known,
- \(\sigma ^2 \sim G(a_{\sigma ^2},b_{\sigma ^2}),\) in which \(a_{\sigma ^2}\), \(b_{\sigma ^2}\) are known,
- \(\phi \sim G(a_{\phi }\eta ,\eta ),\) in which \(a_{\phi }=-2\log (0.05)/\max (\mid \mathbf {s}_i-\mathbf {s}_j\mid )\), a prior distribution proposed by Schmidt and Gelfand (2003),
where \(Pois(\lambda )\) denotes a Poisson distribution with parameter \(\lambda\); \(G(a,b)\) denotes a gamma distribution with mean \(a/b\) and variance \(a/b^2\); and \(N(\mathbf {\mu },\mathbf {A})\) denotes a multivariate normal distribution with mean vector \(\mathbf {\mu }\) and covariance matrix \(\mathbf {A}\). By the definition of a Gaussian process, \(\mathbf {W}\sim N(\mathbf {X}\mathbf \Psi ,\mathbf \Sigma )\), where \(\mathbf {X}\) is an observed matrix of covariates and \(\mathbf \Sigma\) is a covariance matrix whose elements are given by \(\Sigma _{ij}=\sigma ^2\rho _{\phi }(\mathbf {s}_i,\mathbf {s}_j),\;i,j=1,\ldots ,n.\) Thus, considering independence among the prior distributions of the other parameters, the prior distribution for \(\mathbf \Phi\) is given by
\[
\pi (\mathbf \Phi )=\pi (\mathbf {W}\mid \mathbf \Psi ,\sigma ^2,\phi )\,\pi (\beta )\,\pi (\alpha )\,\pi (\mathbf \Psi )\,\pi (\sigma ^2)\,\pi (\phi ). \tag{8}
\]
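To make the prior specification above concrete, the following is a minimal sketch of one joint draw of \((\beta ,\alpha ,\mathbf \Psi ,\sigma ^2,\phi )\) and of the resulting covariance matrix. It assumes an exponential correlation function, \(\rho _{\phi }(\mathbf {s}_i,\mathbf {s}_j)=\exp (-\phi \mid \mathbf {s}_i-\mathbf {s}_j\mid )\), and a scalar \(\mathbf \Psi\) (intercept only). The site coordinates, the hyperparameter values and all function names are illustrative placeholders, not values taken from the paper. Under that correlation function, the prior mean \(a_{\phi }\) of \(\phi\) corresponds to a correlation of 0.05 at half the maximum distance between sites.

```python
import math
import random

def gamma_rate(shape, rate, rng):
    """Draw from G(shape, rate): mean shape/rate, variance shape/rate**2."""
    # random.gammavariate expects (shape, scale), so the rate is inverted.
    return rng.gammavariate(shape, 1.0 / rate)

def max_distance(sites):
    """Largest Euclidean distance between any two sites."""
    return max(math.dist(p, q) for p in sites for q in sites)

def exponential_covariance(sites, sigma2, phi):
    """Sigma_ij = sigma2 * exp(-phi * |s_i - s_j|) (assumed correlation form)."""
    return [[sigma2 * math.exp(-phi * math.dist(p, q)) for q in sites]
            for p in sites]

def draw_hyperparameters(sites, rng, eta=1.0):
    """One joint draw from the independent priors.

    Every numeric hyperparameter here is a placeholder chosen for illustration.
    """
    a_phi = -2.0 * math.log(0.05) / max_distance(sites)
    return {
        "beta": gamma_rate(1.0, 1.0, rng),
        "alpha": gamma_rate(1.0, 1.0, rng),
        "psi": rng.gauss(0.0, 10.0),       # gauss takes a standard deviation
        "sigma2": gamma_rate(2.0, 1.0, rng),
        "phi": gamma_rate(a_phi * eta, eta, rng),  # prior mean equals a_phi
    }

sites = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]  # made-up coordinates
rng = random.Random(2010)
draw = draw_hyperparameters(sites, rng)
Sigma = exponential_covariance(sites, draw["sigma2"], draw["phi"])
```

Note that Python's `random.gammavariate` is parameterized by shape and scale, whereas \(G(a,b)\) above uses a rate \(b\); the helper `gamma_rate` makes that conversion explicit.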
1.2 Posterior distribution
Given the prior distribution (8), and combining this information with the likelihood function given in (7), the joint posterior distribution for \(\mathbf \Phi\) and \(N'_j,j=1,\ldots ,n\) is given by
where \(\mathbf {N'} = (N'_1, \ldots , N'_n)\). To overcome the difficulty of generating samples from the joint posterior distribution (9), we use MCMC simulation methods to obtain the posterior quantities of interest for this model.
1.3 Conditional distributions required for the simulation algorithm
The algorithm used to generate the posterior distribution samples in (9) is a Gibbs sampling algorithm with Metropolis–Hastings steps (Metropolis et al. 1953; Hastings 1970). The conditional posterior densities for the parameters required in the Metropolis–Hastings steps are given by
where \(\Theta _{-\theta }\) denotes the set of parameters \(\Theta\) excluding the parameter \(\theta\).
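As an illustration of what a single Metropolis–Hastings step inside such a Gibbs scan might look like, here is a minimal sketch for one positive parameter. It is not the authors' implementation: the random-walk proposal on the log scale, the step size and the function names are assumptions, and `toy_log_conditional` is a hypothetical stand-in (a \(G(3,2)\) density) for the log of a full conditional \(\pi (\theta \mid \Theta _{-\theta },D)\), known up to a constant.

```python
import math
import random

def mh_step_positive(theta, log_cond, step, rng):
    """One random-walk Metropolis-Hastings update for a parameter theta > 0.

    The proposal is a Gaussian random walk on log(theta); the extra term
    log(proposal) - log(theta) is the Jacobian of that change of variable.
    """
    proposal = theta * math.exp(rng.gauss(0.0, step))
    log_ratio = (log_cond(proposal) - log_cond(theta)
                 + math.log(proposal) - math.log(theta))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return theta, False

def toy_log_conditional(theta, a=3.0, b=2.0):
    """Hypothetical target: log density of G(a, b) up to an additive constant."""
    return (a - 1.0) * math.log(theta) - b * theta

rng = random.Random(1)
theta, chain, accepted = 1.0, [], 0
for _ in range(20000):
    theta, ok = mh_step_positive(theta, toy_log_conditional, step=0.8, rng=rng)
    accepted += ok
    chain.append(theta)

kept = chain[2000:]                      # discard an initial burn-in period
posterior_mean = sum(kept) / len(kept)   # should be close to a / b = 1.5
```

In a full sampler, one such update would be applied in turn to each parameter that lacks a closed-form conditional, with `log_cond` replaced by the corresponding conditional density evaluated at the current values of the remaining parameters.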
1.4 Interpolation
The main objective of geostatistics is to predict, or interpolate, a process under study at any location from data observed at only a fixed, finite number of locations. In this paper, the main goal is to estimate the intensity function of the process at any point in the region of interest A at a time t. Before interpolating the intensity function \(\lambda\), it is necessary to interpolate the values of W. The interpolations of W in the space A and of \(\lambda (\mathbf {s},t)\) are done as follows:
1.4.1 Interpolation in \(\mathbf {W}\)
Let \(\mathbf {W}^{NM}=(W_{n+1},\ldots ,W_{n+m})\) be a vector of values of the function \(W(\cdot )\), corresponding to the geographic points \(\mathbf{s_{n+1},\ldots ,s_{n+m} }\in A\) where the process is not observed. By definition of a Gaussian process, we have
\[
\begin{pmatrix} \mathbf {W}\\ \mathbf {W}^{NM} \end{pmatrix} \Big | \; \mathbf {\Psi },\sigma ^2,\phi \sim N\left[ \begin{pmatrix} \mathbf {A_1}\\ \mathbf {A_2} \end{pmatrix}, \begin{pmatrix} \mathbf {\Sigma _{A_1}} & \mathbf {\Sigma _{A_{12}}}\\ \mathbf {\Sigma _{A_{12}}'} & \mathbf {\Sigma _{A_2}} \end{pmatrix} \right] ,
\]
where \(\mathbf { A_1}=\mathbf { X_{A_1}} \mathbf {\Psi },\) \(\mathbf { A_2}=\mathbf { X_{A_2}}\mathbf { \Psi }\), in which \(\mathbf { X_{A_1}}\) and \(\mathbf { X_{A_2}}\) are the covariate matrices associated with \(\mathbf {W}\) and \(\mathbf { W^{NM}}\) respectively; \(\mathbf {\Sigma _{A_1}}\) is the covariance matrix of \(\mathbf { W}\), \(\mathbf {\Sigma _{A_2}}\) is the covariance matrix of \(\mathbf { W^{NM}}\) and \(\mathbf {\Sigma _{A_{12}}}\) is the matrix of covariances between \(\mathbf { W}\) and \(\mathbf { W^{NM}}\). By the properties of the multivariate normal distribution we have
\[
\mathbf {W}^{NM}\mid \mathbf {W},\mathbf {\Psi },\sigma ^2,\phi \sim N\left( \mathbf {A_2^*},\mathbf {\Sigma _{A_{2}}^*}\right) ,
\]
in which \(\mathbf { A_2^*}=\mathbf { A_2} +\mathbf {\Sigma _{A_{12}}'} \mathbf {\Sigma _{A_{1}}^{-1}}(\mathbf { W}-\mathbf { A_1})\) and \(\mathbf {\Sigma _{A_{2}}^*}=\mathbf {\Sigma _{A_{2}}}-\mathbf {\Sigma _{A_{12}}'} \mathbf {\Sigma _{A_{1}}^{-1}}\mathbf {\Sigma _{A_{12}}}\).
Samples of the posterior distribution of \(\mathbf { W}^{NM}\) are generated from
\[
{\mathbf {W}^{NM}}^{(r)}\sim N\left( \mathbf { {A_2^*}}^{(r)},\mathbf {\Sigma _{A_{2}}^*}^{(r)}\right) ,
\]
where \(\mathbf { {A_2^*}}^{(r)}=\mathbf {{A_2}}^{(r)} +\mathbf {\Sigma _{A_{12}}'}^{(r)} \mathbf {\Sigma _{A_{1}}^{-1}}^{(r)}(\mathbf { W}^{(r)}-\mathbf { {A_1}}^{(r)})\) and \(\mathbf {\Sigma _{A_{2}}^*}^{(r)}=\mathbf {\Sigma _{A_{2}}}^{(r)}-\mathbf {\Sigma _{A_{12}}'}^{(r)} \mathbf {\Sigma _{A_{1}}^{-1}}^{(r)}\mathbf {\Sigma _{A_{12}}}^{(r)}\), with \(\mathbf {A_1}^{(r)}=\mathbf { X}_{A_1}\mathbf {\Psi }^{(r)},\) \(\mathbf { A_2}^{(r)}=\mathbf { X}_{A_2}\mathbf { \Psi }^{(r)}\). The elements of \(\mathbf {\Sigma _{A_{1}}}^{(r)},\) \(\mathbf {\Sigma _{A_{2}}}^{(r)}\) and \(\mathbf {\Sigma _{A_{12}}'}^{(r)}\) are calculated from \({\Sigma ^{(r)}}_{jl}={\sigma ^2}^{(r)}\exp [-\phi ^{(r)}\mid s_j-s_l\mid ]\), \(j,l=1,\ldots ,n+m\), where \(\mathbf { W}^{(r)},\mathbf {\Psi }^{(r)},\;{\sigma ^2}^{(r)},\;\phi ^{(r)}\) (\(r=1,\ldots ,M\)) are the samples of the posterior distribution obtained using the MCMC algorithm.
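The conditional mean and covariance above can be computed without forming \(\mathbf {\Sigma _{A_{1}}^{-1}}\) explicitly. The sketch below does this for a single unobserved site (\(m=1\)) and a single MCMC draw \(r\), using a Cholesky factor of \(\mathbf {\Sigma _{A_{1}}}\). It is an illustration only: the function names, coordinates and numerical values are made up, and the code uses only the Python standard library so that each step stays visible.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def forward_solve(low, b):
    """Solve low @ y = b for a lower-triangular matrix low."""
    y = []
    for i, row in enumerate(low):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def conditional_w_new(sites, w, a1, new_site, a2, sigma2, phi):
    """Mean and variance of W at one unobserved site, given W at observed sites."""
    def cov(p, q):
        return sigma2 * math.exp(-phi * math.dist(p, q))

    sigma_a1 = [[cov(p, q) for q in sites] for p in sites]
    sigma_a12 = [cov(p, new_site) for p in sites]
    low = cholesky(sigma_a1)
    y = forward_solve(low, [wi - mi for wi, mi in zip(w, a1)])
    z = forward_solve(low, sigma_a12)
    mean = a2 + sum(zi * yi for zi, yi in zip(z, y))  # A2 + S12' S1^{-1} (W - A1)
    var = sigma2 - sum(zi * zi for zi in z)           # S2 - S12' S1^{-1} S12
    return mean, var

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]  # made-up observed locations
w_r = [0.3, -0.1, 0.8]                        # made-up draw W^(r)
a1_r = [0.2, 0.2, 0.2]                        # X_{A_1} Psi^(r), intercept only
mean, var = conditional_w_new(sites, w_r, a1_r, (0.5, 0.5), 0.2,
                              sigma2=1.5, phi=0.7)
w_new = random.Random(0).gauss(mean, math.sqrt(var))  # one draw of W^{NM (r)}
```

Repeating the last step for each posterior draw \(r=1,\ldots ,M\) would give a posterior sample of \(W\) at the new site. For \(m>1\) sites the same two triangular solves apply column by column, and the draw requires a factor of the \(m\times m\) conditional covariance matrix.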
1.4.2 Interpolation in \(\lambda\)
The posterior mean of \(\lambda (t,\mathbf {s}_l)\) is estimated by
Morales, F.E.C., Vicini, L., Hotta, L.K. et al. A nonhomogeneous Poisson process geostatistical model. Stoch Environ Res Risk Assess 31, 493–507 (2017). https://doi.org/10.1007/s00477-016-1275-x