Abstract
Normalizing flows are tools for modeling complicated probability density functions, and have attracted considerable interest in recent years. Many flexible families of normalizing flows have been developed. However, the focus to date has largely been on normalizing flows on Euclidean domains; while normalizing flows have been developed for spherical and other non-Euclidean domains, these are generally less flexible than their Euclidean counterparts. To address this shortcoming, in this work we introduce a mixture-of-normalizing-flows model to construct complicated probability density functions on the sphere. This model provides a flexible alternative to existing parametric, semiparametric, and nonparametric finite mixture models. Model estimation is performed using the expectation maximization algorithm and a variant thereof. The model is applied to simulated data, where the benefit over the conventional (single-component) normalizing flow is verified. The model is then applied to two real-world data sets of events occurring on the surface of Earth; the first relating to earthquakes, and the second to terrorist activity. In both cases, we see that the mixture-of-normalizing-flows model yields a good representation of the density of event occurrence.
Notes
Note that maximizing \(Q_g(\Theta _g; \hat{\Theta }_g^{(t)})\) is equivalent to minimizing \(-Q_g(\Theta _g; \hat{\Theta }_g^{(t)})\), which is achieved with stochastic gradient descent (SGD).
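The note above describes the M-step of the EM algorithm: rather than a closed-form update, each component's parameters are fit by gradient descent on \(-Q_g\). The following is a minimal, hypothetical sketch of that scheme, using a toy one-dimensional two-component Gaussian mixture in place of the paper's spherical normalizing-flow components (all variable names and the learning-rate/step settings are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated clusters (stand-ins for the flow components)
x = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 0.5, 200)])

SIGMA = 0.5  # component scale, held fixed for simplicity

def component_pdf(x, mu):
    """Density of one mixture component (Gaussian stand-in for a flow)."""
    return np.exp(-0.5 * ((x - mu) / SIGMA) ** 2) / (SIGMA * np.sqrt(2 * np.pi))

mu = np.array([-1.0, 1.0])   # component parameters (analogue of Theta_g)
pi = np.array([0.5, 0.5])    # mixture weights

for _ in range(50):
    # E-step: responsibilities under the current parameter estimates
    dens = np.stack([pi[g] * component_pdf(x, mu[g]) for g in range(2)])
    resp = dens / dens.sum(axis=0, keepdims=True)

    # M-step for the weights is closed-form
    pi = resp.mean(axis=1)

    # M-step for the component parameters: minimize -Q_g by gradient steps
    # (stand-in for the SGD update applied to each flow's parameters)
    for _ in range(20):
        grad = np.array(
            [np.sum(resp[g] * (x - mu[g])) / SIGMA**2 for g in range(2)]
        )
        mu = mu + 1e-4 * grad  # ascent on Q_g = descent on -Q_g
```

In the paper's setting, the inner loop would instead be SGD over the parameters of the \(g\)-th normalizing flow, with \(Q_g\) evaluated via the flow's change-of-variables log-density.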
Acknowledgements
Andrew Zammit-Mangion’s research was supported by the Australian Research Council (ARC) Discovery Early Career Research Award DE180100203. The authors would also like to thank two anonymous reviewers for their comments which helped improve the quality of the manuscript.
Cite this article
Ng, T.L.J., Zammit-Mangion, A. Mixture modeling with normalizing flows for spherical density estimation. Adv Data Anal Classif 18, 103–120 (2024). https://doi.org/10.1007/s11634-023-00561-7