Exploiting Multilevel Parallelism Within Modern Microprocessors: DWT as a Case Study

  • Conference paper
High Performance Computing for Computational Science - VECPAR 2004 (VECPAR 2004)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3402)

Abstract

Simultaneous multithreading (SMT) is being incorporated into modern superscalar microprocessors, allowing several independent threads to issue instructions to the functional units in a single cycle. Effective use of SMT can hide the inefficiencies caused by long operation latencies, thereby improving utilization of the processor’s resources. In this paper we explore techniques to exploit this capability efficiently, as well as its interaction with short-vector processing. We put special emphasis on the differences in algorithm tuning between SMT architectures and shared-memory symmetric multiprocessors. As a case study we have chosen the well-known Discrete Wavelet Transform (DWT), a central component of image and video coding standards such as MPEG-4 and JPEG-2000.

This work has been supported by the Spanish research grant TIC 2002-750.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tenllado, C., Garcia, C., Prieto, M., Piñuel, L., Tirado, F. (2005). Exploiting Multilevel Parallelism Within Modern Microprocessors: DWT as a Case Study. In: Daydé, M., Dongarra, J., Hernández, V., Palma, J.M.L.M. (eds) High Performance Computing for Computational Science - VECPAR 2004. VECPAR 2004. Lecture Notes in Computer Science, vol 3402. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11403937_42

  • DOI: https://doi.org/10.1007/11403937_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25424-9

  • Online ISBN: 978-3-540-31854-5

  • eBook Packages: Computer Science, Computer Science (R0)
