Clustering constrained on linear networks

  • Original paper
Stochastic Environmental Research and Risk Assessment

Abstract

An unsupervised classification method is proposed for point events occurring on a geometric network. The idea relies on the distributional flexibility and practicality of random partition models to discover the clustering structure of observations from a phenomenon taking place on a given set of edges. Incorporating a spatial effect into the random partition distribution induced by a Dirichlet process makes it possible to control the distance between edges and events, which leads to an appealing clustering method. A Gibbs sampler is proposed and evaluated with a sensitivity analysis. The proposal is motivated and illustrated by an analysis of crime and violence patterns in Mexico City.
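
As a point of reference only, the random partition induced by a Dirichlet process can be simulated sequentially through the Chinese restaurant process. The sketch below is a minimal illustration of that baseline partition prior, not the method of the article: it contains no spatial effect, no network distances, and no Gibbs sampler. The function name `sample_crp_partition` and the parameters `n`, `alpha`, and `seed` are assumptions introduced for this example.

```python
import random


def sample_crp_partition(n, alpha, seed=None):
    """Draw cluster labels for n items from a Chinese restaurant process.

    Hypothetical helper for illustration: alpha > 0 is the concentration
    parameter of the underlying Dirichlet process.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] is the cluster index assigned to item i
    sizes = []   # sizes[k] is the current number of items in cluster k
    for _ in range(n):
        # An item joins existing cluster k with weight sizes[k],
        # or opens a new cluster with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # new cluster
        else:
            sizes[k] += 1     # existing cluster grows
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = sample_crp_partition(n=20, alpha=1.0, seed=1)
    print(labels)
    print("number of clusters:", len(set(labels)))
```

Larger values of `alpha` tend to produce more clusters. A spatially informed model such as the one described in the abstract would presumably modify these assignment weights using distances on the network, but how it does so is not specified on this page.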

Notes

  1. https://datos.cdmx.gob.mx/dataset/carpetas-de-investigacion-fgj-de-la-ciudad-de-mexico.

Funding

A.F. Martínez, C. Díaz-Avalos and R.H. Mena gratefully acknowledge the financial support of PAPIIT project number IG100221. J. Mateu was partially supported by project PID2019-107392RB-I00/AEI/10.13039/501100011033.

Author information

Corresponding author

Correspondence to Asael Fabian Martínez.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

For the small synthetic dataset, several sampling specifications were run; similarly, for the real data application, different prior specifications were used. Posterior estimates are presented for all these cases. (pdf 4452 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Martínez, A.F., Chaudhuri, S., Díaz-Avalos, C. et al. Clustering constrained on linear networks. Stoch Environ Res Risk Assess 37, 1983–1995 (2023). https://doi.org/10.1007/s00477-022-02376-y

  • DOI: https://doi.org/10.1007/s00477-022-02376-y
