Accelerating Mobile Video: A 64-Bit SIMD Architecture for Handheld Applications


Providing quality video applications on handheld mobile devices requires increased computational capability. Single Instruction Multiple Data (SIMD) techniques expose and accelerate the data parallelism inherent in video processing, which increases performance in handheld and wireless systems. This paper introduces a new 64-bit SIMD coprocessor for the Intel® XScale® microarchitecture that is optimized for low-power handheld applications. The architecture blends the SIMD media-processing style with the capabilities of the XScale microarchitecture. The paper gives an overview of the architecture, its instruction set, programming model, pipeline organization, and functional units. It also describes how key features of the architecture improve the performance of video applications compared with a scalar implementation. Measured performance and power results are then analyzed to show how the opportunity to save power by reducing frequency and voltage can be realized.
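The two ideas in the abstract, operating on several pixels per instruction and trading the resulting headroom for lower frequency and voltage, can be illustrated with a minimal sketch. It is not the coprocessor's instruction set: it emulates one 64-bit packed byte addition with ordinary Python integers, and all function names and constants (`pack_u8`, `packed_add_u8`, `relative_dynamic_power`, and so on) are hypothetical. The power function assumes the usual first-order CMOS model in which dynamic power is proportional to V² · f, and does not reproduce any measured result from the paper.

```python
# Illustrative sketch only. Names are hypothetical and not taken from the paper.
# A 64-bit word is treated as eight independent 8-bit lanes (e.g. eight pixels).

MASK64 = 0xFFFFFFFFFFFFFFFF
LOW7 = 0x7F7F7F7F7F7F7F7F   # low seven bits of every byte lane
HIGH1 = 0x8080808080808080  # top bit of every byte lane


def pack_u8(lanes):
    """Pack eight 8-bit values into one 64-bit word, lane 0 in the low byte."""
    if len(lanes) != 8:
        raise ValueError("expected exactly eight lanes")
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack_u8(word):
    """Split a 64-bit word back into its eight 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]


def packed_add_u8(a, b):
    """Add all eight byte lanes in one pass, wrapping modulo 256 per lane.

    The low seven bits of each lane are added first so that no carry can
    cross a lane boundary; the top bit of each lane is then fixed up by XOR.
    """
    low = (a & LOW7) + (b & LOW7)
    return (low ^ ((a ^ b) & HIGH1)) & MASK64


def scalar_add_u8(xs, ys):
    """Reference scalar version: one addition per pixel."""
    return [(x + y) & 0xFF for x, y in zip(xs, ys)]


def relative_dynamic_power(freq_scale, volt_scale):
    """Assumed first-order model: dynamic power scales with V^2 * f.

    Both arguments are ratios relative to a baseline operating point.
    """
    return volt_scale ** 2 * freq_scale


if __name__ == "__main__":
    xs = [250, 1, 2, 3, 4, 5, 6, 255]
    ys = [10, 1, 1, 1, 1, 1, 1, 1]
    packed = unpack_u8(packed_add_u8(pack_u8(xs), pack_u8(ys)))
    print(packed == scalar_add_u8(xs, ys))
    # Hypothetical operating point: half the clock rate at 80% of the voltage.
    print(relative_dynamic_power(0.5, 0.8))
```

The packed version does in one word-wide operation what the scalar loop does in eight. Under the assumed model, if such a speedup lets a workload meet its deadline at a lower clock rate, and the lower clock rate in turn permits a lower supply voltage, the dynamic power falls faster than the frequency alone.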




Author information

Correspondence to N. C. Paver.

Additional information

Nigel C. Paver has 13 years of experience with the ARM architecture. In the Intel PCA Components group in Austin, Texas, he is responsible for the architecture and implementation of multimedia coprocessors for the Intel XScale microarchitecture. He is also involved in product architecture and definition of Intel PCA processors. Before Intel, Nigel was one of the lead designers of the early AMULET asynchronous ARM microprocessors at the University of Manchester. He was also vice president of a startup company that used asynchronous design techniques to produce a low-power asynchronous DSP core. Nigel holds a Master of Science degree and a Ph.D. in computer science from the University of Manchester and a Bachelor of Science degree in electronics from UMIST.

Moinul Khan is a multimedia product architect in the Intel Corporation PCA Components group. He is responsible for PCA graphics and security architecture. His research interests are virtual prototyping, signal processing algorithms and architecture, and communications networking. Before joining Intel he was a technology specialist and founding member of a startup at ATDC, Georgia. He worked on his doctoral research at the Georgia Center for Advanced Telecommunications Technology at the Georgia Institute of Technology. He received his B.Tech from the Indian Institute of Technology and his MSEE from Georgia Tech. He also worked as a research member for the Canadian Institute for Telecommunications Research and Bell Communications Laboratories.

Bradley C. Aldrich joined Intel in 1997, where he is currently an architect within the PCA Components Group. His current work includes the development of coprocessor instruction support as well as image capture and display technologies for XScale-based application processors. He was previously a member of the Intel/Analog Devices joint development architecture team responsible for video enhancements for the Micro Signal Architecture. Prior to that he was a video system architect in Intel’s Digital Imaging and Video Division, working on CMOS sensors, still cameras, and tethered PC-based video peripherals. He has also worked as a device engineer for Motorola and as a test engineer for Tektronix. He received a BSEE in 1988 and an MSEE in 1994 from the University of Texas at San Antonio.

Christopher D. Emmons received a Bachelor of Science degree in Computer Science from the University of Texas at Austin in 2003. He joined Intel in 2001 and is currently a multimedia architect responsible for algorithm development and performance optimization for handheld products within the PCA Components Group. Prior to this, he worked as an applications engineer providing performance and power analysis in support of product marketing groups. His research interests include video compression, operating system design, and dynamic resource management.

About this article

Cite this article

Paver, N.C., Khan, M.H., Aldrich, B.C. et al. Accelerating Mobile Video: A 64-Bit SIMD Architecture for Handheld Applications. J VLSI Sign Process Syst Sign Image Video Technol 41, 21–34 (2005). https://doi.org/10.1007/s11265-005-6248-0


  • wireless video
  • SIMD
  • multi-media
  • architecture
  • SOC