A New Subspace-Based Algorithm for Efficient Spatially Adaptive Sparse Grid Regression, Classification and Multi-evaluation
As data has become easier to collect and precise sensors have become ubiquitous, data mining on large data sets has become an important problem. Because sparse grid data mining scales only linearly in the number of data points, it has been used successfully to address large data mining problems. Still, highly efficient algorithms are required to process very large problems within a reasonable amount of time.
In this paper, we introduce a new algorithm for solving regression and classification problems on spatially adaptive sparse grids. Additionally, our approach can be used to efficiently evaluate a spatially adaptive sparse grid function at multiple points in the domain. In contrast to other algorithms for these applications, ours maps well to modern hardware and performs only a few unnecessary basis function evaluations.
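To make the multi-evaluation task concrete, the following is a minimal sketch of the naive streaming baseline that algorithms such as the one in this paper improve upon: a sparse grid function u(x) = Σ_j α_j · Π_d φ_{l_jd, i_jd}(x_d), built from hierarchical hat basis functions, is evaluated at many points by looping over all grid points. The function names, the array layout for levels, indices, and surpluses, and the use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hat(level, index, x):
    # 1D hierarchical hat basis: phi_{l,i}(x) = max(0, 1 - |2^l * x - i|)
    return np.maximum(0.0, 1.0 - np.abs(2.0**level * x - index))

def multi_eval(levels, indices, surpluses, points):
    """Naive streaming multi-evaluation of a sparse grid function:
    evaluates every basis function at every point, costing
    O(#grid points * #eval points * dimension) operations."""
    m, d = points.shape
    result = np.zeros(m)
    for lvl, idx, alpha in zip(levels, indices, surpluses):
        contrib = np.ones(m)
        for dim in range(d):
            contrib *= hat(lvl[dim], idx[dim], points[:, dim])
        result += alpha * contrib
    return result

# Example: a 2D grid with one basis function at level (1,1), index (1,1),
# surplus 2.0, evaluated at two points of the unit square.
levels = np.array([[1, 1]])
indices = np.array([[1, 1]])
surpluses = np.array([2.0])
pts = np.array([[0.5, 0.5], [0.25, 0.5]])
print(multi_eval(levels, indices, surpluses, pts))  # [2. 1.]
```

The baseline evaluates every basis function at every point even when its support does not contain the point; avoiding exactly those wasted evaluations while keeping the memory access pattern hardware-friendly is what a subspace-based scheme targets.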
We evaluated our algorithm by comparing it to a highly efficient implementation of a streaming algorithm for sparse grid regression. We observed speedups of up to 7×, and our algorithm was faster in every experiment we performed.
This work was financially supported by the Juniorprofessurenprogramm of the Landesstiftung Baden-Württemberg.