Abstract
There is a growing demand for performing larger-scale Bayesian inference tasks, arising from greater data availability and higher-dimensional model parameter spaces. In this work we present parallelization strategies for integrated nested Laplace approximations (INLA), a popular framework for performing approximate Bayesian inference on the class of latent Gaussian models. Our approach combines nested thread-level parallelism, a parallel line search procedure using robust regression in INLA’s optimization phase, and the state-of-the-art sparse linear solver PARDISO. We leverage mutually independent function evaluations in the algorithm as well as advanced sparse linear algebra techniques, which lets us flexibly utilize the power of today’s multi-core architectures. We demonstrate the performance of our new parallelization scheme on a number of different real-world applications. The introduction of parallelism leads to speedups of a factor of 10 or more for all larger models. Our work is already integrated in the current version of the open-source R-INLA package, making its improved performance conveniently available to all users.
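As a rough illustration of what exploiting mutually independent function evaluations can look like, the sketch below computes a central-difference gradient whose evaluations are all dispatched concurrently. It is a minimal example under stated assumptions and not a description of the R-INLA implementation: the function names `objective` and `parallel_gradient`, the step size `h`, and the use of Python threads are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(theta):
    # Hypothetical stand-in for an expensive objective function;
    # here a simple quadratic with its minimum at (1, 1, ..., 1).
    return sum((t - 1.0) ** 2 for t in theta)


def parallel_gradient(f, theta, h=1e-5, max_workers=4):
    """Central-difference gradient; the 2*d evaluations are independent."""
    d = len(theta)
    points = []
    for i in range(d):
        for sign in (1.0, -1.0):
            p = list(theta)
            p[i] += sign * h  # perturb one coordinate at a time
            points.append(p)
    # Each point can be evaluated without knowledge of the others,
    # so all evaluations are submitted to the pool at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(f, points))
    # values[2*i] is f(theta + h*e_i), values[2*i + 1] is f(theta - h*e_i).
    return [(values[2 * i] - values[2 * i + 1]) / (2.0 * h) for i in range(d)]


grad = parallel_gradient(objective, [0.0, 2.0, 1.0])
```

In CPython, threads only yield a real speedup when the objective spends its time in code that releases the global interpreter lock (for example compiled numerical routines); the point of the sketch is the structure of the independent evaluations, not the timing.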
Acknowledgements
We would like to thank Prof. D. Bolin and Dr. D. Rustand for their support with the different case studies. The work of L. Gaedke-Merzhäuser has been supported by the SNF SINERGIA Project No. CRSII5_189942.
Funding
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Gaedke-Merzhäuser, L., van Niekerk, J., Schenk, O. et al. Parallelized integrated nested Laplace approximations for fast Bayesian inference. Stat Comput 33, 25 (2023). https://doi.org/10.1007/s11222-022-10192-1
DOI: https://doi.org/10.1007/s11222-022-10192-1