Abstract
An unsupervised classification method for point events occurring on a geometric network is proposed. The idea relies on the distributional flexibility and practicality of random partition models to discover the clustering structure of observations from a phenomenon taking place on a given set of edges. Incorporating a spatial effect into the random partition distribution induced by a Dirichlet process makes it possible to control the distance between edges and events, which leads to an appealing clustering method. A Gibbs sampler is proposed and evaluated through a sensitivity analysis. The proposal is motivated and illustrated by the analysis of crime and violence patterns in Mexico City.
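As a rough illustration of the kind of random partition a Dirichlet process induces, the sketch below simulates the Chinese restaurant process, in which each new item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. This is only the standard, non-spatial construction: it does not include the network distance effect that the proposal adds, and the function name `sample_crp_partition` and the parameter name `theta` are assumptions introduced here, not the authors' notation or code.

```python
import random
from typing import List


def sample_crp_partition(n: int, theta: float, seed: int = 0) -> List[int]:
    """Draw cluster labels for n items from a Chinese restaurant process.

    theta > 0 is the concentration parameter (hypothetical name used here).
    Labels are integers 0, 1, 2, ... assigned in order of first appearance.
    """
    rng = random.Random(seed)
    labels: List[int] = []
    sizes: List[int] = []  # sizes[j] = number of items currently in cluster j
    for _ in range(n):
        # Existing cluster j has weight sizes[j]; a new cluster has weight theta.
        weights = sizes + [theta]
        j = rng.choices(range(len(weights)), weights=weights)[0]
        if j == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[j] += 1  # join an existing cluster
        labels.append(j)
    return labels


if __name__ == "__main__":
    print(sample_crp_partition(20, theta=1.0))
```

Larger values of `theta` tend to produce more clusters. A spatially informed partition model would modify these weights so that events close to each other along the network are more likely to share a cluster.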
Funding
A.F. Martínez, C. Díaz-Avalos and R.H. Mena thankfully acknowledge the financial support of PAPIIT project number IG100221. J. Mateu was partially supported by project PID2019-107392RB-I00/AEI/10.13039/501100011033.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Supplementary Information
Supplementary material
For the small synthetic dataset, several sampling specifications were run; similarly, for the real data application, different prior specifications were used. Posterior estimates are presented for all these cases. (pdf 4452 KB)
About this article
Cite this article
Martínez, A.F., Chaudhuri, S., Díaz-Avalos, C. et al. Clustering constrained on linear networks. Stoch Environ Res Risk Assess 37, 1983–1995 (2023). https://doi.org/10.1007/s00477-022-02376-y
DOI: https://doi.org/10.1007/s00477-022-02376-y