New Data Structures to Handle Speculative Parallelization at Runtime

  • Alvaro Estebanez
  • Diego R. Llanos
  • Arturo Gonzalez-Escribano

Abstract

Software-based thread-level speculation (TLS) is a technique that optimistically executes in parallel those loops whose fully-parallel semantics cannot be guaranteed at compile time. Modern TLS libraries make it possible to handle arbitrary data structures speculatively. This desirable feature comes at a high cost in local store and/or remote recovery times: the easier the local store, the harder the remote recovery. Unfortunately, both operations lie on the critical path of any TLS system. In this paper we propose a solution that performs local stores in constant time and recovers values in a time on the order of \(T\), where \(T\) is the number of threads. As we will see, this solution, together with some additional improvements, makes the difference between slowdowns and noticeable speedups in the speculative parallelization of non-synthetic, pointer-based applications on a real system. Our experimental results show gains of 3.58\(\times \) to 28\(\times \) with respect to the baseline system, and a relative efficiency of up to, on average, 65% with respect to a TLS implementation specifically tailored to the benchmarks used.
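The store/recovery trade-off described above can be illustrated with a minimal sketch: each thread keeps its own hash-indexed version table, so a speculative store is a single probe into the thread's local table (expected constant time), while a speculative load walks the tables of the current thread and its logical predecessors, from most to least recent (at most \(T\) lookups). This is not the data structure proposed in the paper; all names (`spec_store`, `spec_load`, table sizes, and the assumption that a higher thread id means a logically later iteration) are hypothetical simplifications for illustration.

```c
#include <stdint.h>

#define NTHREADS   4
#define TABLE_SIZE 256  /* power of two; per-thread version table */

typedef struct {
    uintptr_t addr;   /* speculatively written address (0 = empty slot) */
    long      value;  /* most recent speculative value for that address */
} Slot;

/* One open-addressing version table per thread. */
static Slot table[NTHREADS][TABLE_SIZE];

static unsigned hash_addr(uintptr_t addr) {
    return (unsigned)((addr >> 3) * 2654435761u) & (TABLE_SIZE - 1);
}

/* Local store: expected O(1) - one hash probe into the thread's own table. */
void spec_store(int tid, uintptr_t addr, long value) {
    unsigned h = hash_addr(addr);
    while (table[tid][h].addr && table[tid][h].addr != addr)
        h = (h + 1) & (TABLE_SIZE - 1);   /* linear probing on collision */
    table[tid][h].addr  = addr;
    table[tid][h].value = value;
}

/* Remote recovery: O(T) - walk predecessor threads, most recent first.
 * Returns 1 and fills *out if some predecessor (or the thread itself)
 * speculatively wrote addr; returns 0 if the caller should read memory. */
int spec_load(int tid, uintptr_t addr, long *out) {
    for (int t = tid; t >= 0; t--) {
        unsigned h = hash_addr(addr);
        while (table[t][h].addr) {
            if (table[t][h].addr == addr) {
                *out = table[t][h].value;
                return 1;
            }
            h = (h + 1) & (TABLE_SIZE - 1);
        }
    }
    return 0;
}
```

In a real TLS runtime the thread-to-iteration mapping wraps around a sliding window and commits/squashes must also be handled, but the asymmetry is the same: the store touches one table, the load may touch up to \(T\).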

Keywords

Thread-level speculation · Speculative parallelism · Memory improvements


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Alvaro Estebanez (1)
  • Diego R. Llanos (1)
  • Arturo Gonzalez-Escribano (1)
  1. Departamento de Informática, Universidad de Valladolid, Valladolid, Spain