Abstract
In this paper, we perform an empirical evaluation of the Parallel External Memory (PEM) model in the context of geometric problems. In particular, we implement the parallel distribution sweeping framework of Ajwani, Sitchinava and Zeh to solve the batched 1-dimensional stabbing max problem. Although modern processors feature sophisticated memory systems (multiple levels of caches, set associativity, TLB, prefetching), we show empirically that algorithms designed in simple models that focus on minimizing the I/O transfers between shared memory and a single-level cache can lead to efficient software on current multicore architectures. Our implementation exhibits significantly fewer accesses to slow DRAM and therefore outperforms traditional approaches based on plane sweep and two-way divide-and-conquer.
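For orientation, the following is a minimal sequential sketch of the plane-sweep style of baseline that the abstract compares against. It is not the authors' implementation and not the PEM algorithm. It assumes the usual formulation of batched 1-dimensional stabbing max, which the abstract does not spell out: given weighted closed intervals and a batch of query points, report for each query the maximum weight among the intervals containing it. The function name `stabbing_max_sweep` and the `(left, right, weight)` tuple layout are illustrative choices.

```python
import heapq


def stabbing_max_sweep(intervals, queries):
    """Plane-sweep baseline (illustrative sketch, assumed problem statement).

    intervals: list of (left, right, weight) with closed endpoints.
    queries:   list of query points.
    Returns a list with, for each query, the maximum weight of an interval
    containing it, or None if no interval contains it.
    """
    by_left = sorted(intervals)  # sweep order for interval start events
    order = sorted(range(len(queries)), key=lambda i: queries[i])
    answers = [None] * len(queries)

    heap = []  # max-heap on weight, stored as (-weight, right)
    nxt = 0    # next interval in by_left that has not been activated yet
    for qi in order:
        q = queries[qi]
        # Activate every interval whose left endpoint is at or before q.
        while nxt < len(by_left) and by_left[nxt][0] <= q:
            _, right, weight = by_left[nxt]
            heapq.heappush(heap, (-weight, right))
            nxt += 1
        # Lazy deletion: drop heaviest entries that ended before q.
        # Queries are visited in increasing order, so an expired interval
        # can never become relevant again.
        while heap and heap[0][1] < q:
            heapq.heappop(heap)
        if heap:
            answers[qi] = -heap[0][0]
    return answers


if __name__ == "__main__":
    demo_intervals = [(0, 5, 3), (2, 8, 7), (6, 9, 1)]
    demo_queries = [1, 4, 7, 10, 5.5]
    print(stabbing_max_sweep(demo_intervals, demo_queries))
```

The sweep touches the heap once per event, and for large inputs those accesses are scattered across memory. This is the kind of DRAM traffic that an approach designed to minimize I/O transfers aims to reduce.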
References
Aggarwal, A., Vitter, J.S.: The input/output complexity of sorting and related problems. Communications of the ACM 31(9), 1116–1127 (1988)
Ajwani, D., Sitchinava, N.: Empirical evaluation of the parallel distribution sweeping framework on multicore architectures. CoRR abs/1306.4521 (2013)
Ajwani, D., Sitchinava, N., Zeh, N.: Geometric algorithms for private-cache chip multiprocessors. In: de Berg, M., Meyer, U. (eds.) ESA 2010, Part II. LNCS, vol. 6347, pp. 75–86. Springer, Heidelberg (2010)
Ajwani, D., Sitchinava, N., Zeh, N.: I/O-optimal distribution sweeping on private-cache chip multiprocessors. In: IPDPS, pp. 1114–1123 (2011)
Arge, L., Goodrich, M.T., Nelson, M.J., Sitchinava, N.: Fundamental parallel algorithms for private-cache chip multiprocessors. In: SPAA, pp. 197–206 (2008)
Bender, M.A., Farach-Colton, M., Fineman, J.T., Fogel, Y.R., Kuszmaul, B.C., Nelson, J.: Cache-oblivious streaming B-trees. In: SPAA, pp. 81–92 (2007)
Bentley, J.L., Ottmann, T.A.: Algorithms for reporting and counting geometric intersections. IEEE Transactions on Computers 28(9), 643–647 (1979)
Blelloch, G.E.: Prefix sums and their applications. In: Reif, J.H. (ed.) Synthesis of Parallel Algorithms, pp. 35–60. Morgan Kaufmann Publishers (1993)
Blelloch, G.E., Chowdhury, R.A., Gibbons, P.B., Ramachandran, V., Chen, S., Kozuch, M.: Provably good multicore cache performance for divide-and-conquer algorithms. In: SODA, pp. 501–510 (2008)
Blelloch, G.E., Fineman, J.T., Gibbons, P.B., Simhadri, H.V.: Scheduling irregular parallel computations on hierarchical caches. In: SPAA, pp. 355–366. ACM (2011)
Brodal, G.S., Fagerberg, R., Vinther, K.: Engineering a cache-oblivious sorting algorithm. ACM Journal of Experimental Algorithmics 12 (2007)
Chowdhury, R.A., Ramachandran, V.: The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation. In: SPAA, pp. 71–80 (2007)
Chowdhury, R.A., Ramachandran, V.: Cache-efficient dynamic programming for multicores. In: SPAA, pp. 207–216 (2008)
Goodrich, M.T., Tsay, J.J., Vengroff, D.E., Vitter, J.S.: External-memory computational geometry. In: FOCS, pp. 714–723 (1993)
Kang, S., Ediger, D., Bader, D.A.: Algorithm engineering challenges in multicore and manycore systems. IT - Information Technology 53(6), 266–273 (2011)
Mehlhorn, K., Sanders, P.: Scanning multiple sequences via cache memory. Algorithmica 35, 75–93 (2003). doi:10.1007/s00453-002-0993-2
Shamos, M.I., Hoey, D.: Geometric intersection problems. In: FOCS, pp. 208–215. IEEE Computer Society Press (1976)
Singler, J., Sanders, P., Putze, F.: MCSTL: The multi-core standard template library. In: Kermarrec, A.-M., Bougé, L., Priol, T. (eds.) Euro-Par 2007. LNCS, vol. 4641, pp. 682–694. Springer, Heidelberg (2007)
Sitchinava, N., Zeh, N.: A parallel buffer tree. In: SPAA, pp. 214–223 (2012)
Tang, Y., Chowdhury, R.A., Kuszmaul, B.C., Luk, C.K., Leiserson, C.E.: The Pochoir stencil compiler. In: SPAA, pp. 117–128 (2011)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ajwani, D., Sitchinava, N. (2013). Empirical Evaluation of the Parallel Distribution Sweeping Framework on Multicore Architectures. In: Bodlaender, H.L., Italiano, G.F. (eds) Algorithms – ESA 2013. ESA 2013. Lecture Notes in Computer Science, vol 8125. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40450-4_3
DOI: https://doi.org/10.1007/978-3-642-40450-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40449-8
Online ISBN: 978-3-642-40450-4
eBook Packages: Computer Science, Computer Science (R0)