Abstract
As hardware systems move toward multicore and multithreaded architectures, programmers increasingly rely on automated tools to help with both the parallelization of legacy codes and effective exploitation of all available hardware resources. Thread-level speculation (TLS) has been proposed as a technique to parallelize the execution of serial codes or serial sections of parallel codes. One of the key aspects of TLS is task selection for speculative execution.
In this paper we propose a cost model for compiler-driven task selection for TLS. The model employs profile-based analysis of may-dependences to estimate the probability of successful speculation. We discuss two techniques to eliminate potential inter-task dependences, thereby improving the rate of successful speculation. We also present a profiling tool, DProf, that is used to provide run-time information about may-dependences to the compiler and map dynamic dependences to the source code. This information is also made available to the programmer to assist in code rewriting and/or algorithm redesign.
We used DProf to quantify the potential of this approach and we present results on selected applications from the SPEC CPU2006 and SEQUOIA benchmarks.
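As a rough illustration of how profiled may-dependences could feed a task-selection cost model, the sketch below estimates the probability of successful speculation as the fraction of profiled runs in which no cross-task dependence materialized, then keeps only the tasks whose expected benefit is positive. It is a minimal sketch and not the model proposed in the paper: the `TaskProfile` fields, the cycle costs, and the benefit formula are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Profile data for one candidate task (all fields are hypothetical)."""
    name: str
    invocations: int        # times the task ran during profiling
    conflicts: int          # runs in which a may-dependence materialized across tasks
    work_cycles: float      # average cycles of useful work per run
    overhead_cycles: float  # assumed cost of spawning and committing a speculative task
    squash_cycles: float    # assumed cost of rollback and re-execution after a conflict


def success_probability(p: TaskProfile) -> float:
    """Fraction of profiled runs with no cross-task dependence."""
    if p.invocations == 0:
        return 0.0  # no profile data: assume speculation never succeeds
    return 1.0 - p.conflicts / p.invocations


def expected_benefit(p: TaskProfile) -> float:
    """Expected cycles saved per run if the task is executed speculatively."""
    prob = success_probability(p)
    gain = prob * p.work_cycles            # work overlapped when speculation succeeds
    loss = (1.0 - prob) * p.squash_cycles  # work wasted when speculation fails
    return gain - loss - p.overhead_cycles


def select_tasks(profiles):
    """Return the names of tasks whose expected benefit is positive."""
    return [p.name for p in profiles if expected_benefit(p) > 0.0]


# Made-up profile numbers, for illustration only.
candidates = [
    TaskProfile("loop_a", 1000, 10, 500.0, 50.0, 600.0),   # dependence rarely materializes
    TaskProfile("loop_b", 1000, 700, 500.0, 50.0, 600.0),  # dependence materializes often
]
print(select_tasks(candidates))
```

With these made-up numbers only `loop_a` is selected: its may-dependence materialized in 1% of profiled runs, whereas `loop_b` conflicts in 70% of runs and the assumed squash cost outweighs the overlapped work.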
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, P., Kejariwal, A., Caşcaval, C. (2008). Compiler-Driven Dependence Profiling to Guide Program Parallelization. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_16
DOI: https://doi.org/10.1007/978-3-540-89740-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer Science, Computer Science (R0)