Abstract
To further parallelize large-scale nonlinear scientific computing applications, several data dependence tests for nonlinear subscripts, especially quadratic subscripts, have been proposed. The quadratic programming (QP) test and the polynomial variable interval (PVI) test are two representative techniques. The QP test is exact but time-consuming, and it gives only conservative results when the coefficient matrix of the quadratic terms is not positive semi-definite, while the PVI test loses efficiency when the dependence equation contains mixed polynomials. Focusing on the dependences caused by quadratic subscripts in nonlinear and irregular programs, we propose an improved nonlinear data dependence test in this paper. We first normalize a quadratic equation written in general form, and then use interval equation theory to determine whether the resulting canonical equation has an integer solution in the region of interest. Experimental results show that, compared with the QP test, our method maintains a much lower time complexity. Furthermore, for quadratic subscripts it can detect more general dependences than other dependence testing methods such as the PVI test.
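To make the dependence question concrete, the following is a minimal sketch and not the test proposed in the paper. It assumes a hypothetical single loop in which one statement writes `A[f(i)]` and another reads `A[g(j)]`, where `f` and `g` are quadratic subscripts given as coefficient triples `(a, b, c)` for `a*x*x + b*x + c`. The function names `quad_range`, `may_depend`, and `depends_exact` are illustrative only. The first check is a conservative interval comparison; the second is a brute-force enumeration used as ground truth, which is affordable only for small loop bounds.

```python
import math


def quad_range(a, b, c, lo, hi):
    """Exact min and max of a*x*x + b*x + c over integers lo <= x <= hi."""
    candidates = {lo, hi}
    if a != 0:
        # The extremum over the integers lies at an endpoint or at an
        # integer neighbour of the real vertex -b / (2a).
        vertex = -b / (2 * a)
        for x in (math.floor(vertex), math.floor(vertex) + 1):
            if lo <= x <= hi:
                candidates.add(x)
    values = [a * x * x + b * x + c for x in candidates]
    return min(values), max(values)


def may_depend(f, g, lo, hi):
    """Conservative check: False proves independence, True means 'maybe'."""
    fmin, fmax = quad_range(*f, lo, hi)
    gmin, gmax = quad_range(*g, lo, hi)
    return not (fmax < gmin or gmax < fmin)


def depends_exact(f, g, lo, hi):
    """Brute-force ground truth: is f(i) == g(j) for some i, j in [lo, hi]?"""
    written = {f[0] * i * i + f[1] * i + f[2] for i in range(lo, hi + 1)}
    return any(g[0] * j * j + g[1] * j + g[2] in written
               for j in range(lo, hi + 1))


if __name__ == "__main__":
    # Write A[i*i], read A[j*j + 2*j + 3], with 1 <= i, j <= 10.
    f, g = (1, 0, 0), (1, 2, 3)
    print(may_depend(f, g, 1, 10))     # subscript ranges overlap
    print(depends_exact(f, g, 1, 10))  # but no integer solution exists
```

In this example the subscript ranges overlap, so the interval check alone cannot rule out a dependence, yet `i*i = (j+1)*(j+1) + 2` has no integer solution because a difference of two squares is never 2. That gap between a conservative answer and the exact answer is what more precise tests for quadratic subscripts aim to close.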
Acknowledgments
We would like to acknowledge the anonymous referees for their invaluable comments and suggestions on this paper. Our work is supported by the HEGAOJI Major Project of China under Grant No. 2009ZX01036-001-001-2 and the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing No. 2013A11.
Cite this article
Zhao, J., Zhao, R., Chen, X. et al. An improved nonlinear data dependence test. J Supercomput 71, 340–368 (2015). https://doi.org/10.1007/s11227-014-1298-3
DOI: https://doi.org/10.1007/s11227-014-1298-3