Processor Allocation for Optimistic Parallelization of Irregular Programs

  • Francesco Versaci
  • Keshav Pingali
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7333)

Abstract

Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and rolling back conflicting tasks. However, parallelism in irregular algorithms is very complex. In a regular algorithm like dense matrix multiplication, the amount of parallelism can usually be expressed as a function of the problem size, so it is reasonably straightforward to determine how many processors should be allocated to execute a regular algorithm of a certain size (this is called the processor allocation problem). In contrast, parallelism in irregular algorithms can be a function of input parameters, and the amount of parallelism can vary dramatically during the execution of the irregular algorithm. Therefore, the processor allocation problem for irregular algorithms is very difficult.

In this paper, we describe the first systematic strategy for addressing this problem. Our approach is based on a construct called the conflict graph, which (i) provides insight into the amount of parallelism that can be extracted from an irregular algorithm, and (ii) can be used to address the processor allocation problem for irregular algorithms. We show that this problem is related to a generalization of the unfriendly seating problem and, by extending Turán’s theorem, we obtain a worst-case class of problems for optimistic parallelization, which we use to derive a lower bound on the exploitable parallelism. Finally, using some theoretically derived properties and some experimental facts, we design a quick and stable control strategy for solving the processor allocation problem heuristically.
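The role of the conflict graph can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not the authors' algorithm: tasks are nodes, an edge joins two tasks that cannot both commit, a round launches `m` randomly chosen tasks, and a launched task commits only if none of its neighbours has already committed in that round. The names `launch_round` and `turan_bound` are hypothetical. The second function evaluates the classical Turán-type bound, which says that a graph with n nodes and average degree d contains an independent set of size at least n/(d+1).

```python
import random


def launch_round(adj, m, rng):
    """Simulate one optimistic round on a conflict graph (assumed model).

    adj: dict mapping each task to the set of tasks it conflicts with.
    m:   number of processors, i.e. tasks launched speculatively.
    rng: a random.Random instance, so runs are reproducible.
    Returns (committed, aborted) as lists of tasks.
    """
    # Pick m distinct tasks at random; the sample order doubles as
    # the commit priority within the round.
    launched = rng.sample(sorted(adj), min(m, len(adj)))
    committed, aborted = [], []
    taken = set()
    for task in launched:
        if adj[task] & taken:
            # A neighbour already committed: roll this task back.
            aborted.append(task)
        else:
            committed.append(task)
            taken.add(task)
    return committed, aborted


def turan_bound(adj):
    """Classical bound n / (d + 1) on the size of an independent set,
    where d is the average degree of the conflict graph."""
    n = len(adj)
    avg_degree = sum(len(neighbours) for neighbours in adj.values()) / n
    return n / (avg_degree + 1)


if __name__ == "__main__":
    # Hypothetical example: 12 tasks whose conflicts form a ring.
    n = 12
    ring = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    rng = random.Random(0)
    for m in (2, 4, 8, 12):
        committed, aborted = launch_round(ring, m, rng)
        print(m, len(committed), len(aborted))
    print("bound:", turan_bound(ring))
```

In this toy model, launching more tasks than the graph can accommodate mostly increases the number of aborts, which is the trade-off a processor allocation strategy has to balance.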

Keywords

Irregular algorithms; Optimistic parallelization; Automatic parallelization; Amorphous data-parallelism; Processor allocation; Unfriendly seating; Turán’s theorem

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Francesco Versaci (1, 2)
  • Keshav Pingali (3)
  1. University of Padova, Italy
  2. Technische Universität Wien, Austria
  3. University of Texas at Austin, USA
