Alternating Direction Method of Multipliers for Hierarchical Basis Approximators
Sparse grids have been used successfully for mining vast datasets with a moderate number of dimensions. Compared to established machine learning techniques such as artificial neural networks or support vector machines, sparse grids provide an analytic approximant that is easier to analyze and to interpret. More importantly, they are based on a high-dimensional discretization of the feature space, are thus less data-dependent than conventional approaches, scale only linearly in the number of data points, and are well suited to huge amounts of data. With increasing size of the datasets used for learning, however, computing times can become prohibitively large for normal use, despite the linear scaling. Efficient parallelization strategies are therefore needed to exploit the power of modern hardware. We investigate the parallelization opportunities for solving high-dimensional machine learning problems with adaptive sparse grids using the alternating direction method of multipliers (ADMM). ADMM allows us to split the initially large problem into smaller subproblems. These can then be solved in parallel, and their reduced size can even be small enough to allow an explicit assembly of the system matrices. We show first results of the new approach on a set of problems and discuss the challenges that arise when applying ADMM to a hierarchical basis.
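The splitting idea can be illustrated with a minimal sketch of global-consensus ADMM applied to a regularized least-squares problem. This is not the authors' implementation: the random matrix B merely stands in for a sparse grid basis matrix, the objective 0.5*||B x - y||^2 + 0.5*lam*||x||^2 is an assumed model problem, and the function name consensus_admm_ridge and the values of lam, rho and the iteration count are hypothetical choices for illustration.

```python
import numpy as np


def consensus_admm_ridge(blocks, lam, rho=50.0, iters=400):
    """Consensus ADMM (scaled form) for
    min_x 0.5 * sum_i ||B_i x - y_i||^2 + 0.5 * lam * ||x||^2,
    where blocks is a list of (B_i, y_i) data chunks (assumed setup)."""
    n = blocks[0][0].shape[1]
    num_blocks = len(blocks)
    # Each block's system matrix is small enough to assemble explicitly, once.
    systems = [(B.T @ B + rho * np.eye(n), B.T @ y) for B, y in blocks]
    x = np.zeros((num_blocks, n))  # local copies of the coefficients
    u = np.zeros((num_blocks, n))  # scaled dual variables
    z = np.zeros(n)                # global consensus variable
    for _ in range(iters):
        # Local solves are independent of each other and could run in parallel.
        for i, (M, c) in enumerate(systems):
            x[i] = np.linalg.solve(M, c + rho * (z - u[i]))
        # Consensus step: closed-form minimizer including the regularizer.
        z = rho * (x + u).sum(axis=0) / (lam + num_blocks * rho)
        # Dual update penalizes disagreement between local and global variables.
        u += x - z
    return z


# Synthetic data, split row-wise into four blocks of 50 data points each.
rng = np.random.default_rng(0)
B = rng.standard_normal((200, 10))
y = rng.standard_normal(200)
lam = 1e-2
blocks = [(B[i:i + 50], y[i:i + 50]) for i in range(0, 200, 50)]

z = consensus_admm_ridge(blocks, lam)
# Reference: solve the unsplit normal equations directly.
x_direct = np.linalg.solve(B.T @ B + lam * np.eye(10), B.T @ y)
```

In this sketch only the small n-by-n block systems are ever assembled, and the inner loop over blocks is the part that a parallel implementation would distribute. The penalty parameter rho affects how quickly the local solutions agree, so in practice it would need tuning for the problem at hand.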
Keywords: Sparse Grid, Augmented Lagrangian Method, Alternate Direction Method, Memory Footprint, Shared Memory System