
Mixture modeling with normalizing flows for spherical density estimation

  • Regular Article
  • Published in: Advances in Data Analysis and Classification

Abstract

Normalizing flows are objects used for modeling complicated probability density functions, and have attracted considerable interest in recent years. Many flexible families of normalizing flows have been developed; however, the focus to date has largely been on flows over Euclidean domains. While normalizing flows have been developed for spherical and other non-Euclidean domains, these are generally less flexible than their Euclidean counterparts. To address this shortcoming, we introduce a mixture-of-normalizing-flows model for constructing complicated probability density functions on the sphere. This model provides a flexible alternative to existing parametric, semiparametric, and nonparametric finite mixture models. Model estimation is performed using the expectation-maximization (EM) algorithm and a variant thereof. The model is first applied to simulated data, where its benefit over the conventional (single-component) normalizing flow is verified. It is then applied to two real-world data sets of events occurring on the surface of Earth: the first relating to earthquakes, and the second to terrorist activity. In both cases, the mixture-of-normalizing-flows model yields a good representation of the density of event occurrence.
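The mixture density takes the form \(p(\textbf{x}) = \sum_{g} \pi_g f_g(\textbf{x})\), where each component density \(f_g\) on the sphere is constructed from a normalizing flow. As a minimal, self-contained sketch of how such a mixture is evaluated, the snippet below uses von Mises-Fisher densities on \(\mathbb{S}^2\) as stand-ins for the flow components; the vMF form, the parameter values, and all function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.special import logsumexp

def log_vmf_density(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution on the 2-sphere.

    Normalizing constant on S^2: kappa / (4 * pi * sinh(kappa)).
    """
    log_c = np.log(kappa) - np.log(4.0 * np.pi * np.sinh(kappa))
    return log_c + kappa * (x @ mu)

def mixture_log_density(x, weights, mus, kappas):
    """log p(x) = log sum_g pi_g f_g(x), computed stably via logsumexp."""
    comp = np.stack([np.log(w) + log_vmf_density(x, mu, k)
                     for w, mu, k in zip(weights, mus, kappas)], axis=-1)
    return logsumexp(comp, axis=-1)

# A two-component mixture on the sphere (illustrative values).
weights = np.array([0.6, 0.4])
mus = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
kappas = np.array([5.0, 10.0])

x = np.array([0.0, 0.0, 1.0])  # a point on the unit sphere
log_p = mixture_log_density(x, weights, mus, kappas)
```

In the actual model, each `log_vmf_density` call would be replaced by the log-density of a spherical normalizing flow component, computed via the change-of-variables formula.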


Notes

  1. Note that maximizing \(Q_g(\Theta _g; \hat{\Theta }_g^{(t)})\) is equivalent to minimizing \(-Q_g(\Theta _g; \hat{\Theta }_g^{(t)})\), which is achieved with SGD.

  2. https://www.start.umd.edu/gtd/.
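As the first note indicates, the M-step maximization of \(Q_g(\Theta _g; \hat{\Theta }_g^{(t)})\) can be implemented as gradient-based minimization of \(-Q_g\). The sketch below illustrates this for a single hypothetical von Mises-Fisher stand-in component with a fixed mean direction, updating only the concentration \(\kappa\) by full-batch gradient descent (SGD would subsample the weighted terms); the data, responsibilities, and step size are illustrative assumptions.

```python
import numpy as np

# Illustrative data on the unit sphere and E-step responsibilities.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
resp = rng.uniform(size=200)               # responsibilities r_i from the E-step
t = x @ np.array([0.0, 0.0, 1.0])          # x_i . mu for a fixed mean direction

def neg_Q(kappa):
    """-Q_g(kappa) = -sum_i r_i [log C(kappa) + kappa * x_i . mu],
    with vMF normalizing constant C(kappa) = kappa / (4 pi sinh(kappa))."""
    log_c = np.log(kappa) - np.log(4.0 * np.pi * np.sinh(kappa))
    return -(resp * (log_c + kappa * t)).sum()

# Gradient descent on -Q_g, using
# d(-Q)/d(kappa) = -sum_i r_i [1/kappa - coth(kappa) + t_i].
kappa, lr = 1.0, 0.005
for _ in range(200):
    grad = -(resp * (1.0 / kappa - 1.0 / np.tanh(kappa) + t)).sum()
    kappa = max(kappa - lr * grad, 1e-6)   # keep the concentration positive
```

For a sufficiently small step size, each iteration decreases \(-Q_g\), which is exactly the equivalence the note appeals to.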


Acknowledgements

Andrew Zammit-Mangion’s research was supported by the Australian Research Council (ARC) Discovery Early Career Researcher Award DE180100203. The authors would also like to thank two anonymous reviewers for their comments, which helped improve the quality of the manuscript.

Author information

Corresponding author

Correspondence to Tin Lok James Ng.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ng, T.L.J., Zammit-Mangion, A. Mixture modeling with normalizing flows for spherical density estimation. Adv Data Anal Classif 18, 103–120 (2024). https://doi.org/10.1007/s11634-023-00561-7
