International Journal of Parallel Programming, Volume 31, Issue 2, pp 107–136

An Extended ANSI C for Processors with a Multimedia Extension

  • Patricio Bulić
  • Veselko Guštin

Abstract

This paper presents the Multimedia C (MMC) language, which is designed for the multimedia extensions included in all modern microprocessors. The paper discusses the language syntax, the implementation of its compiler, and its use in developing multimedia applications. The goal was to give programmers the most natural way of using multimedia processing facilities from the C language. MMC has been used to implement some of the most frequently used multimedia kernels, and the experiments presented on these scientific and multimedia applications show good performance improvements.

Keywords: Vector C; SIMD processing; ISA; multimedia extensions
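
To make concrete the kind of kernel the abstract refers to, the following is a minimal sketch in plain ANSI-style C, not in MMC syntax, which this page does not show. It adds two arrays of unsigned 8-bit samples with saturation, first one element at a time and then in groups of eight, the number of bytes that fit in a 64-bit multimedia register. All function names and the `LANES` constant are hypothetical and chosen only for illustration; the assumption is that a multimedia extension could execute each group of eight as a single packed saturating addition (for example PADDUSB in MMX).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical group width: eight 8-bit lanes fill one 64-bit register. */
#define LANES 8

/* Saturating add of two unsigned 8-bit values: sums above 255 clamp to 255. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Scalar form: one element per loop iteration. */
void add_pixels_scalar(uint8_t *dst, const uint8_t *x, const uint8_t *y,
                       size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = sat_add_u8(x[i], y[i]);
}

/* Blocked form: the inner loop over LANES elements is the unit of work
 * that a packed saturating-add instruction could perform in one step.
 * Here it is still ordinary scalar C, so the result is identical. */
void add_pixels_blocked(uint8_t *dst, const uint8_t *x, const uint8_t *y,
                        size_t n)
{
    size_t i = 0;

    /* Full groups of LANES elements. */
    for (; i + LANES <= n; i += LANES)
        for (size_t k = 0; k < LANES; ++k)
            dst[i + k] = sat_add_u8(x[i + k], y[i + k]);

    /* Remaining elements that do not fill a whole group. */
    for (; i < n; ++i)
        dst[i] = sat_add_u8(x[i], y[i]);
}
```

Both functions produce the same output for any input length; the blocked form only exposes the fixed-width grouping that a language extension or compiler would need to express in order to use the packed instructions.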

Copyright information

© Plenum Publishing Corporation 2003

Authors and Affiliations

  • Patricio Bulić
    • 1
  • Veselko Guštin
    • 1
  1. Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
