On the Sublinear Processor Gap for Parallel Architectures

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7876)

Abstract

In the past, parallel algorithms were developed, for the most part, under the assumption that the number of processors is Θ(n), where n is the size of the input. If the actual number of processors was smaller in practice, Brent's Lemma could be used to simulate the highly parallel solution on a lower-degree parallel architecture. In this paper, however, we argue that design and implementation issues of algorithms and architectures differ significantly, both in theory and in practice, between computational models with high and low degrees of parallelism. We report an observed gap in the behavior of a parallel architecture depending on the number of processors. This gap appears repeatedly, both in empirical cases, when studying practical aspects of architecture design and program implementation, and in theoretical instances, when studying the behavior of various parallel algorithms. It separates the performance, design and analysis of systems with a sublinear number of processors from those of systems with linearly many processors. More specifically, we observe that systems with either logarithmically many cores or O(n^α) cores (with α < 1) exhibit qualitatively different behavior than a system whose number of cores is linear in the size of the input, i.e., Θ(n). The evidence we present suggests the existence of a sharp theoretical gap between the classes of problems that can be efficiently parallelized with o(n) processors and with Θ(n) processors, unless P = NC.
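The Brent's Lemma simulation mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, and a balanced tree sum over n values is assumed as the highly parallel computation. The sketch counts the steps needed when each parallel level is shared among p processors and compares that count with the standard bound floor(W / p) + D, where W is the total number of operations and D is the depth.

```python
def reduction_level_sizes(n):
    """Additions performed at each parallel step of a balanced tree sum of n values."""
    sizes = []
    while n > 1:
        ops = n // 2      # pairs combined in this step
        sizes.append(ops)
        n -= ops          # ceil(n / 2) partial sums remain
    return sizes


def simulated_steps(level_sizes, p):
    """Steps taken when the operations of every level are shared among p processors."""
    return sum((w + p - 1) // p for w in level_sizes)  # ceil(w / p) per level


def brent_bound(level_sizes, p):
    """Upper bound floor(W / p) + D from Brent's Lemma (illustrative helper)."""
    work = sum(level_sizes)     # W: total operations
    depth = len(level_sizes)    # D: steps with unboundedly many processors
    return work // p + depth


if __name__ == "__main__":
    levels = reduction_level_sizes(16)
    for p in (1, 4, 16):
        print(p, simulated_steps(levels, p), brent_bound(levels, p))
```

For n = 16 and p = 4, the sketch reports 5 simulated steps against a bound of 7. It only illustrates the classical simulation argument; it does not model the scheduling, cache, or implementation effects that the paper argues distinguish sublinear from linear processor counts.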

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

López-Ortiz, A., Salinger, A. (2013). On the Sublinear Processor Gap for Parallel Architectures. In: Chan, TH.H., Lau, L.C., Trevisan, L. (eds) Theory and Applications of Models of Computation. TAMC 2013. Lecture Notes in Computer Science, vol 7876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38236-9_18

  • DOI: https://doi.org/10.1007/978-3-642-38236-9_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38235-2

  • Online ISBN: 978-3-642-38236-9

  • eBook Packages: Computer Science, Computer Science (R0)
