
Software Speculation on Caching DSMs

Published in: International Journal of Parallel Programming

Abstract

Clusters with caching DSMs deliver both programmability and performance: they support a shared-memory programming model and tolerate the communication latency of remote fetches via caching. The input of a data-parallel program is partitioned across the machines in the cluster, while the DSM transparently fetches and caches remote data as needed by the application. Irregular applications are challenging to parallelize because data dependences that depend on the input and manifest only at runtime require speculation to exploit parallelism efficiently. By speculating that there are no cross-iteration dependences, multiple iterations of a data-parallel loop are executed in parallel using locally cached copies of data; the absence of dependences is validated before the speculatively computed results are committed. In this paper we show that in irregular data-parallel applications, while caching helps tolerate long communication latencies, using a value read from the cache in a computation can lead to misspeculation; thus aggressive caching can degrade performance by increasing the misspeculation rate. To limit the misspeculation rate, we present optimizations for distributed speculation on caching-based DSMs that decrease the cost of the misspeculation check and speed up the re-execution of misspeculated computations. These optimizations yield speedups of 2.24× for graph coloring, 1.71× for connected components, 1.88× for community detection, 1.32× for shortest path, and 1.74× for PageRank over baseline parallel executions.
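The speculate–validate–commit cycle described above can be sketched as follows. This is a minimal single-machine illustration, not the paper's implementation: the versioned store, function names, and validation scheme are all illustrative assumptions standing in for the DSM's cache-coherence and commit machinery.

```python
# Sketch of speculative loop execution: each iteration reads from
# (possibly stale) cached values, records the versions it read, and
# commits only if no other iteration has since written those values.
# On a failed validation (misspeculation), the iteration re-executes.

class VersionedStore:
    """Stand-in for the authoritative shared store; each key has a version."""
    def __init__(self, data):
        self.values = dict(data)
        self.versions = {k: 0 for k in data}

    def read(self, key):
        return self.values[key], self.versions[key]

    def commit(self, reads, writes):
        # Validate: every value read must still carry the version we saw.
        for key, ver in reads.items():
            if self.versions[key] != ver:
                return False  # misspeculation: a dependence manifested
        for key, val in writes.items():
            self.values[key] = val
            self.versions[key] += 1
        return True


def speculative_execute(store, iteration_body, indices):
    """Run each iteration against cached reads; re-execute on misspeculation."""
    for i in indices:
        while True:
            reads, writes = {}, {}

            def cached_read(key):
                val, ver = store.read(key)
                reads[key] = ver  # remember the version for validation
                return val

            iteration_body(i, cached_read, writes)
            if store.commit(reads, writes):
                break  # speculation succeeded; otherwise retry


# Usage: iterations that all touch shared cell "a" (a potential
# cross-iteration dependence that speculation optimistically ignores).
store = VersionedStore({"a": 1, "b": 2})

def body(i, read, writes):
    writes["a"] = read("a") + i

speculative_execute(store, body, [1, 2, 3])
print(store.values["a"])  # 1 + 1 + 2 + 3 = 7
```

In a distributed setting the iterations would run concurrently across machines against cached copies, so failed validations (and hence re-executions) actually occur; the paper's optimizations target exactly the cost of that validation check and of the re-execution.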



Author information

Correspondence to Sai Charan Koduru.

Additional information

This work was supported in part by NSF Grants CCF-1318103 and CCF-1524852.

Rights and permissions

Reprints and permissions

About this article


Cite this article

Koduru, S.C., Vora, K. & Gupta, R. Software Speculation on Caching DSMs. Int J Parallel Prog 46, 313–332 (2018). https://doi.org/10.1007/s10766-017-0499-9

