Engineering Algorithms for Large Data Sets
For many applications, the data sets to be processed grow much faster than traditionally available algorithms can handle. We therefore have to come up with new, dramatically more scalable approaches. To do that, we have to bring together know-how from the application, from traditional algorithm theory, and from low-level aspects such as parallelism, memory hierarchies, energy efficiency, and fault tolerance. The methodology of algorithm engineering, with its emphasis on realistic models and its cycle of design, analysis, implementation, and experimental evaluation, can serve as the glue between these requirements. This paper outlines the general challenges and gives examples from my work, such as sorting, full-text indexing, graph algorithms, and database engines.
Keywords: Fault Tolerance, Memory Hierarchy, Parallel Disk, Disk Array, Array Construction