The Refutation of Amdahl’s Law and Its Variants

  • F. Dévai
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10990)

Abstract

Amdahl’s law, which limits the speedup achievable with multiple processors on the basis of sequential and parallelizable fractions of computations, has been used to justify, among other things, asymmetric chip multiprocessor architectures and concerns about “dark silicon”. This paper demonstrates flaws in Amdahl’s law: (i) in theory, no inherently sequential fractions of computations exist; (ii) sequential fractions appearing in practice are different from parallelizable fractions and usually have different growth rates of time requirements; and (iii) the time requirement of sequential fractions can be proportional to the number of processors. However, mathematical analyses are also provided to demonstrate that sequential fractions have a negligible effect on speedup if the growth rate of the time requirement of the parallelizable fraction is higher than that of the sequential fraction. Examples are given showing that Amdahl’s law and its variants fail to represent limits to parallel computation. In particular, Gustafson’s law, claimed by some authors to be a refutation of Amdahl’s law, is shown to contradict established theoretical results. We conclude that no simple formula or law governing concurrency exists.
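As a rough illustration of the growth-rate argument in the abstract, the following sketch compares the textbook forms of Amdahl’s and Gustafson’s formulas with a simple model in which the sequential and parallelizable parts each have their own time function of the problem size n. The function names, and the choice of linear sequential time and quadratic parallelizable time, are assumptions made for illustration only; they are not taken from the paper’s analysis.

```python
def amdahl_speedup(f, p):
    """Textbook Amdahl speedup: f is the sequential fraction, p the processor count."""
    return 1.0 / (f + (1.0 - f) / p)


def gustafson_speedup(f, p):
    """Textbook Gustafson scaled speedup: f is the sequential fraction of the parallel run."""
    return p - f * (p - 1)


def growth_rate_speedup(n, p, seq_time, par_time):
    """Speedup when the sequential and parallelizable parts have separate
    time functions of the problem size n (hypothetical illustrative model)."""
    s = seq_time(n)   # time that is not divided among processors
    w = par_time(n)   # time that is divided evenly among p processors
    return (s + w) / (s + w / p)


if __name__ == "__main__":
    p = 64
    # A fixed 5% sequential fraction caps the Amdahl speedup well below p.
    print("Amdahl, f=0.05:", amdahl_speedup(0.05, p))
    print("Gustafson, f=0.05:", gustafson_speedup(0.05, p))
    # Assumed growth rates: sequential part ~ n, parallelizable part ~ n^2.
    for n in (10**2, 10**4, 10**6):
        s_n = growth_rate_speedup(n, p, lambda k: k, lambda k: k * k)
        print("n =", n, "speedup =", s_n)
```

Under these assumed growth rates the modelled speedup approaches p as n grows, because the sequential part becomes a vanishing share of the total time.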

Keywords

Amdahl’s law; Gustafson’s law; Sequential and parallelizable workload; Growth rates; Asymmetric chip multiprocessor architectures; Graphics processing units; Hidden-surface removal; Inherently sequential computations; P-complete problems; Data-parallel computing

Notes

Acknowledgements

The author thanks the anonymous reviewers for their support and constructive criticism, which helped to improve the presentation of the paper. One of the reviewers drew the author’s attention to the fact that experimental results complement the author’s demonstration that parallelizable fractions of computational loads may grow with the problem size. The author also appreciates comments by Shekhar Y. Borkar, John L. Hennessy and David A. Patterson.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. London South Bank University, London, UK
  2. Hungarian Academy of Sciences, Budapest, Hungary