Abstract
We introduce an extensible, knowledge-based tool for idiom (pattern) recognition in DSP (digital signal processing) programs. The tool uses functionality provided by the Cetus compiler infrastructure to detect computation patterns that frequently occur in DSP code. We focus on recognizing patterns for for-loops and the statements in their bodies, as these are often the performance-critical constructs in DSP applications, for which replacement by highly optimized, target-specific parallel algorithms is most profitable. To structure pattern recognition and make it more efficient, we classify patterns into levels of complexity such that higher-level patterns are defined in terms of lower-level patterns. The tool works statically on the intermediate representation. For better extensibility and abstraction, most of the structural part of the recognition rules is specified in XML, which separates the tool implementation from the pattern specifications. Information about detected patterns will later be used for optimized code generation by local algorithm replacement, e.g., for the low-power, high-throughput multicore DSP architecture ePUMA.
Notes
Certain transformations, such as loop distribution, are applied to help recognition when a for-loop body contains multiple statements that together do not match any single defined pattern. Loop distribution factors statements of the loop body that are not part of a cyclic data dependence into separate loops, so that each resulting loop can be matched on its own. The details of the whole process can be found in [24].
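As a minimal sketch of the loop distribution described in this note (written in Python for brevity, whereas the tool itself analyzes DSP source code through Cetus), the function names and the two example statements below are illustrative assumptions and are not taken from the paper. The fused loop mixes an element-wise addition with a dot-product reduction; since neither statement depends on the other, the loop can be split into two loops that each have a single, recognizable shape.

```python
def fused_loop(a, b):
    """Original form: two independent statements share one loop body."""
    n = len(a)
    c = [0.0] * n
    s = 0.0
    for i in range(n):
        c[i] = a[i] + b[i]   # S1: element-wise vector addition
        s += a[i] * b[i]     # S2: dot-product reduction (depends only on itself)
    return c, s


def distributed_loops(a, b):
    """After loop distribution: one loop per statement.

    S1 and S2 are not connected by any data dependence cycle, so
    executing all iterations of S1 before S2 preserves the result.
    """
    n = len(a)
    c = [0.0] * n
    for i in range(n):       # loop 1: candidate for a vector-addition pattern
        c[i] = a[i] + b[i]
    s = 0.0
    for i in range(n):       # loop 2: candidate for a dot-product pattern
        s += a[i] * b[i]
    return c, s


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [4.0, 5.0, 6.0]
    print(fused_loop(a, b) == distributed_loops(a, b))
```

Both functions compute the same values; the distributed form simply exposes each statement as its own loop, which is the situation the note describes as easier to match.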
Although arrays are considered variables in most programming languages, Cetus handles them differently, defining a dedicated type (ArrayAccess) for them.
The ADDMULTIMUL pattern supports an arbitrary number of factors.
The non-structural part of the recognition rules is handled by separate auxiliary matching functions that are called by reflection, as explained in Sect. 4.4.
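The idea of calling auxiliary matching functions by reflection can be sketched as follows. This is only an illustration under stated assumptions: the tool is built on Cetus and would presumably use Java reflection, while this sketch uses Python's `getattr` as a stand-in. The class `AuxiliaryMatchers`, the functions `same_index` and `unit_stride`, and the helper `call_auxiliary` are hypothetical names and do not come from the paper.

```python
class AuxiliaryMatchers:
    """Hypothetical collection of non-structural checks, looked up by name."""

    @staticmethod
    def same_index(lhs_index, rhs_index):
        # Assumed check: both array accesses use the same index expression.
        return lhs_index == rhs_index

    @staticmethod
    def unit_stride(step):
        # Assumed check: the loop advances by exactly one per iteration.
        return step == 1


def call_auxiliary(name, *args):
    """Resolve a matcher from its name (e.g. as read from a rule file) and call it."""
    if name.startswith("_"):
        raise LookupError(f"refusing to call private attribute: {name}")
    func = getattr(AuxiliaryMatchers, name, None)
    if func is None or not callable(func):
        raise LookupError(f"unknown auxiliary matcher: {name}")
    return bool(func(*args))


if __name__ == "__main__":
    print(call_auxiliary("same_index", "i", "i"))
    print(call_auxiliary("unit_stride", 2))
```

Because the function is resolved from a string at run time, a rule specification only needs to name the check it requires, and new checks can be added without changing the dispatch code.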
Generally, such merging of siblings is only possible if interference from read or write accesses by other siblings “in between” can be statically excluded; see [13] for details.
References
Arenaz, M., Touriño, J., Doallo, R.: Xark: an extensible framework for automatic recognition of computational kernels. ACM Trans. Program. Lang. Syst. 30, 32:1–32:56 (2008)
Bacry, E.: Lastwave (software). http://www.cmap.polytechnique.fr/~bacry/LastWave/index.html (1997–2009)
Blume, W., Eigenmann, R., Faigin, K., Grout, J., Hoeflinger, J., Padua, D., Petersen, P., Pottenger, W., Rauchwerger, L., Tu, P., Stephen, W.: Polaris: improving the effectiveness of parallelizing compilers. In: Proceedings of the Seventh Workshop on Languages and Compilers for Parallel Computing, pp. 141–154. Springer (1994)
Borsboom, E.: Vocoder (software). http://www.epiphyte.ca/proj/vocoder/ (1995–2011)
Chapman, B., Mehrotra, P., Zima, H.: Programming in Vienna Fortran (1992)
de Castro, E.: Secret Rabbit Code (aka libsamplerate) (software). http://www.mega-nerd.com/SRC/index.html (2005)
Di Martino, B., Kessler, C.W.: Two program comprehension tools for automatic parallelization. IEEE Concurr. 8(1), 37–47 (2000)
Franchetti, F., de Mesmay, F., McFarlin, D., Püschel, M.: Operator language: a program generation framework for fast kernels. In: IFIP Working Conference on Domain Specific Languages (DSL WC), volume 5658 of Lecture Notes in Computer Science, pp. 385–410. Springer (2009)
Hansson, E., Sohl, J., Kessler, C., Liu, D.: Case study of efficient parallel memory access programming for the embedded heterogeneous multicore DSP architecture ePUMA. In: Proceedings of International Workshop on Multi-Core Computing Systems (MuCoCoS-2011), June 2011, IEEE CS Press, Seoul (2011)
Hind, M.: Pointer analysis: haven’t we solved this problem yet? In: Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, PASTE ’01, pp. 54–61. ACM, New York (2001)
Horwitz, S.: Precise flow-insensitive may-alias analysis is NP-hard. ACM Trans. Program. Lang. Syst. 19, 1–6 (1997)
Johnson, T.A., Lee, S.I., Fei, L., Basumallik, A., Eigenmann, R., Midkiff, S.P.: Experiences in using Cetus for source-to-source transformations. In: Proceedings of 17th International Workshop on Languages and Compilers for Parallel Computing (LCPC) (2004)
Kessler, C.W.: Pattern-driven automatic parallelization. Sci. Program. 5(3), 251–274 (1996)
Kuck, D.J., Kuhn, R.H., Padua, D.A., Leasure, B., Wolfe, M.: Dependence graphs and compiler optimizations. In: Proceedings of the 8th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’81, pp. 207–218. ACM, New York (1981)
Landi, W.: Undecidability of static analysis. ACM Lett. Program. Lang. Syst. 1, 323–337 (1992)
Di Martino, B., Iannello, G.: Pap recognizer: a tool for automatic recognition of parallelizable patterns. In: Proceedings of 4th International Workshop on Program Comprehension. IEEE Computer Society, Los Alamitos (1996)
Metzger, R., Wen, Z.: Automatic algorithm recognition and replacement: a new approach to program optimization. MIT Press, Cambridge (2000)
Parr, T.J., Quong, R.W.: Antlr: a predicated-LL(k) parser generator. Softw. Pract. Exper. 25(7), 789–810 (1995)
Peters, J.: Fiview: a digital filter design viewing and comparison tool (software). http://uazu.net/fiview/ (1997–2007)
Phillips, D.: Image processing in C: analyzing and enhancing digital images. R&D Publications, Inc., Lawrence (1994)
Pottenger, B., Eigenmann, R.: Idiom recognition in the Polaris parallelizing compiler. In: Proceedings of 9th International Conference on Supercomputing, pp. 444–448. ACM (1995)
Püschel, M., Moura, J.M.F., Johnson, J.R., Padua, D., Veloso, M.M., Singer, B.W., Xiong, J., Franchetti, F., Gacic, A., Voronenko, Y., Chen, K., Johnson, R.W., Rizzolo, N.: Spiral: code generation for DSP transforms. Proc. IEEE 93(2) (2005)
Sbragion, D.: DRC: digital room correction (software). http://drc-fir.sourceforge.net/ (2002–2006)
Sarvestani, A.S.: Automated Recognition of Algorithmic Patterns in DSP Programs. Master’s thesis, Linköping University, Department of Computer and Information Science (2011) LIU-IDA/LITH-EX-11/052-SE. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-73934
Torger, A.: BruteFIR (software). http://www.ludd.luth.se/~torger/brutefir.html (2001–2006, 2009)
Torger, A.: AlmusVCU (software). http://www.ludd.luth.se/~torger/almusvcu.html (2006)
Vinod, U.V., Baruah, P.K.: Mpiimgen—a code transformer that parallelizes image processing codes to run on a cluster of workstations. In: IEEE International Conference on Cluster Computing, pp. 5–12 (2004)
Acknowledgments
This project was supported by SSF and SeRC. We would also like to thank the anonymous reviewers for their constructive comments.
Cite this article
Shafiee Sarvestani, A., Hansson, E. & Kessler, C. Extensible Recognition of Algorithmic Patterns in DSP Programs for Automatic Parallelization. Int J Parallel Prog 41, 806–824 (2013). https://doi.org/10.1007/s10766-012-0229-2