Implementing and Evaluating Color-Aware Instruction Set for Low-Memory, Embedded Video Processing in Data Parallel Architectures

  • Jongmyon Kim
  • D. Scott Wills
  • Linda M. Wills
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3824)


Future embedded imaging applications will demand more processing performance while requiring the same low cost and low energy consumption. This paper presents and evaluates a color-aware instruction set extension (CAX) for single instruction, multiple data (SIMD) processor arrays to meet these computational requirements and cost goals. CAX supports parallel operations on two packed 16-bit (6:5:5) YCbCr values in a 32-bit datapath processor, providing greater concurrency and efficiency for color image and video processing. Unlike typical multimedia extensions (e.g., MMX, VIS, and MDMX), CAX harnesses parallelism within the human perceptual color space rather than depending solely on generic subword parallelism. Moreover, the smaller data format reduces system cost, and the resulting reduction in data bandwidth simplifies system design. Experimental results on a representative SIMD array architecture show that CAX achieves a speedup ranging from 5.2× to 8.8× (an average of 6.3×) over the baseline SIMD array performance. In contrast, MDMX (a representative MIPS multimedia extension) achieves a speedup ranging from 3× to 5× (an average of 3.7×) over the same baseline SIMD array. CAX also outperforms MDMX in both area efficiency (a 52% increase versus a 13% increase) and energy efficiency (a 50% increase versus an 11% increase), resulting in better component utilization and sustainable battery life. Furthermore, CAX delivers these performance and efficiency gains with a mere 3% increase in system area and a 5% increase in system power, while MDMX requires a 14% increase in system area and a 16% increase in system power. These results demonstrate that CAX is a suitable candidate for application-specific embedded multimedia systems.
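The packed data format described in the abstract can be illustrated with a minimal Python sketch. It is not the CAX instruction set itself: the bit layout (Y in the top 6 bits, then Cb, then Cr), the treatment of all three channels as unsigned magnitudes, and every function name here are assumptions made for illustration. The only details taken from the abstract are the 6:5:5 split and the packing of two 16-bit pixels into one 32-bit word.

```python
# Illustrative model of two 16-bit (6:5:5) YCbCr pixels packed in a 32-bit word.
# The field order and all names below are hypothetical, not taken from the paper.

def pack_pixel(y, cb, cr):
    """Pack one pixel into 16 bits: Y in bits 15..10, Cb in 9..5, Cr in 4..0 (assumed layout)."""
    if not (0 <= y < 64 and 0 <= cb < 32 and 0 <= cr < 32):
        raise ValueError("channel out of range for 6:5:5")
    return (y << 10) | (cb << 5) | cr

def unpack_pixel(p):
    """Split a 16-bit pixel back into its (Y, Cb, Cr) fields."""
    return (p >> 10) & 0x3F, (p >> 5) & 0x1F, p & 0x1F

def pack_word(p_hi, p_lo):
    """Place two 16-bit pixels side by side in one 32-bit word."""
    return ((p_hi & 0xFFFF) << 16) | (p_lo & 0xFFFF)

def unpack_word(w):
    """Split a 32-bit word into its two 16-bit pixels (high, low)."""
    return (w >> 16) & 0xFFFF, w & 0xFFFF

def saturating_add_word(a, b):
    """Add two packed words channel by channel, clamping each field to its maximum.

    A software stand-in for the kind of operation a color-aware
    instruction could perform on all six fields at once.
    """
    out = []
    for pa, pb in zip(unpack_word(a), unpack_word(b)):
        ya, cba, cra = unpack_pixel(pa)
        yb, cbb, crb = unpack_pixel(pb)
        out.append(pack_pixel(min(ya + yb, 63),
                              min(cba + cbb, 31),
                              min(cra + crb, 31)))
    return pack_word(out[0], out[1])

w1 = pack_word(pack_pixel(60, 10, 3), pack_pixel(1, 2, 3))
w2 = pack_word(pack_pixel(10, 30, 4), pack_pixel(4, 5, 6))
hi, lo = unpack_word(saturating_add_word(w1, w2))
print(unpack_pixel(hi), unpack_pixel(lo))
```

The loop processes six channel values per 32-bit word. A generic subword extension working on equal-width lanes would need a wider or differently split register to hold the same two pixels, which is the contrast the abstract draws with MMX, VIS, and MDMX.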







Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jongmyon Kim (1)
  • D. Scott Wills (2)
  • Linda M. Wills (2)

  1. Chip Solution Center, Samsung Advanced Institute of Technology, Kyungki-do, South Korea
  2. School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta
