Compression Algorithms for Diffuse Data

  • Roy Hoffman


In this chapter, the introduction to data compression algorithms continues with discussions of the algorithms selected as official standards (or, in some cases, de facto standards) for compressing diffuse data: speech, audio, image, and video. Marketplace forces strongly influence which compression algorithms become standards, and nowhere is this more evident than in diffuse data compression. The algorithms described in this chapter were selected as standards, first, because they provide good-quality compressed speech, audio, image, and video at reasonable data rates; second, because they can be economically implemented in VLSI hardware (or, in some cases, software); and third, because they can deliver data in real time, a requirement for speech, audio, and video applications. However, diffuse data compression, particularly video compression, is still new, with a plethora of compression techniques to choose from and still more, likely better, ones waiting to be discovered. This, coupled with the continuing discovery of new applications for diffuse data compression, means the process of creating standards is hardly finished; in fact, it may have just begun. Throughout the chapter, we describe the evolution leading up to today's standards, pointing out standards in decline and areas where new standards are likely to emerge. As with our discussion of symbolic data compression algorithms, the simplest, most general algorithms for each data type are described first, followed by more powerful or more specialized algorithms.


Keywords: Compression Ratio · Speech Signal · Vector Quantization · Diffuse Data Compression Algorithm




  1. The term speech coding is used rather than speech compression, which in speech research and development is sometimes reserved for time-scale modification of the speech signal. For example, as a learning aid, speech may be speeded up during playback.
  2. ITU-T G.726 replaces ITU-T G.721 and another recommendation pertaining to ADPCM, ITU-T G.723 [Sayo95]. ITU-T later reused the G.723 designation for a new recommendation, discussed below.
  3. This is a new ITU-T G.723 recommendation. An earlier ITU-T G.723 recommendation pertaining to ADPCM coding was replaced by ITU-T G.726 [Sayo95].
  4. The terms audio coding, rather than data compression, and audio coder, rather than data compression algorithm, are used, following the conventions of the literature in this field.
  5. Red, green, and blue are the additive primary colors for light-emitting systems such as computer displays and scanners. Cyan (a blue-green), magenta, and yellow are the subtractive primary colors that, in combination with black, are used in printing, a light-absorbing process [Penn93, Trow94].
  6. READ is an acronym standing for Relative Element Address Designate.
  7. The JPEG committee was first chartered in 1986 by ISO and ITU-T. The JPEG standard is a collaborative effort of three major international standards organizations: ISO, ITU-T, and IEC. It is formally specified by ISO Draft International Standard 10918 and ITU-T Recommendation T.81 [Penn93].
  8. The Y luminance component provides a gray-scale version of the image. It is formed as a weighted sum of the red, green, and blue color components. The CB and CR chrominance components provide the additional information needed to convert the gray-scale image to a color image. They are formed from combinations of the luminance and color components [Penn93].
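The weighted-sum construction described in the note can be sketched as follows. The coefficients here are the ITU-R BT.601 values commonly used with JPEG; this is an assumption for illustration, since the chapter's exact coefficients come from [Penn93], and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit R, G, B samples to Y (luminance) and CB, CR (chrominance).

    Uses ITU-R BT.601 weights; CB and CR are offset by 128 so that a
    neutral (gray) pixel maps to the middle of the 8-bit range.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # weighted sum of R, G, B
    cb = 128 + 0.564 * (b - y)              # blue-difference chrominance
    cr = 128 + 0.713 * (r - y)              # red-difference chrominance
    return y, cb, cr
```

For any gray pixel (R = G = B), the weights sum to 1.0, so Y equals the input value and both chrominance components sit at the neutral value 128, which is exactly the "gray-scale version" behavior the note describes.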
  9. Before the DCT step, the spatial-domain pixel values are threshold shifted (by 128 for 8-bit pixels), becoming positive and negative values. Following the IDCT, this threshold shift is undone.
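The shift described in the note can be sketched as a pair of inverse operations (a minimal illustration assuming 8-bit samples; the function names are hypothetical):

```python
def level_shift(block, bits=8):
    """Shift unsigned samples so they straddle zero before the DCT."""
    offset = 1 << (bits - 1)          # 128 for 8-bit samples
    return [p - offset for p in block]

def undo_level_shift(block, bits=8):
    """Restore the unsigned sample range after the IDCT."""
    offset = 1 << (bits - 1)
    return [p + offset for p in block]
```

With 8-bit input in 0..255, the shifted values lie in -128..127, so the DCT operates on signed data centered on zero; applying the second function after the IDCT recovers the original range.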
  10. In the ISO JPEG standard, see "Table K.1 Luminance quantization table," as reproduced on page 503 of [Penn93].
  11. Interlacing is a "trick" analog television systems use to conserve transmission bandwidth. It exploits psychovisual properties of the human eye, making the frame rate appear to be twice its actual value. This reduces the "flicker" that is visible when frames are displayed at too low a frame rate.
  12. The CCIR 601 recommendation (standard) for studio-quality, digitized NTSC television specifies 480 active lines per picture and 720 active pixels per line. Home television receivers can display all the lines but, because of the bandwidth-limited standard NTSC transmission channel, can reproduce only 300 to 350 pixels per line.
  13. The H.261 and MPEG standards specify the YCBCR format. References to the YUV format appear frequently in the popular digital video compression literature, where the two formats are used interchangeably. For a comparison of the various color component systems, see [Penn93].
  14. Blanking intervals are those times when the scanning process is moving between lines and fields within a frame (shown as horizontal and vertical retrace in Figure 5.11). Only the digital bits that represent active parts of the picture need be coded and transmitted.
  15. Frame rate subsampling is used only for applications where the resulting jerkiness in motion can be tolerated. Often, when frame rate subsampling is used, it is done dynamically by the encoder, and only if there is no other option for meeting an immutable transmission bandwidth or storage constraint.
  16. H.320 is actually the third videoconferencing standard created by ITU-T. The first, H.120, was adopted by the ITU-T in 1984. It operated at 1.5 Mbps and could not compete with proprietary algorithms that offered better picture quality and lower data rates. The same fate befell N x 384, a mid-eighties effort that never became a formal standard [Port94, Halh91].
  17. H.261 restricts encoding delay to 150 milliseconds so as not to disrupt the interactivity of two-way, face-to-face conversations. Longer delays produce effects similar to those experienced when telephone conversations are transmitted by satellite.
  18. MPEG audio was described in Section 5.3.
  19. Experience with current video coding algorithms teaches that about 25K luminance pixels per frame and about one-fourth this value for the chrominance pixels (i.e., QCIF resolution), at frame rates that approach 30 progressively scanned frames per second, are needed to achieve acceptable quality video for general applications [Ebra95].
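The pixel counts quoted in the note can be checked with a little arithmetic, assuming QCIF means a 176 x 144 luminance array with each chrominance component subsampled by two in both dimensions (a common convention, not stated explicitly in the note):

```python
# QCIF luminance resolution (assumed 176 x 144)
luma = 176 * 144                 # 25,344 luminance pixels per frame, i.e. "about 25K"
# Each chrominance component is half-resolution in both dimensions
chroma_per_component = 88 * 72   # 6,336 pixels: one-fourth of the luminance count
assert chroma_per_component * 4 == luma
```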
  20. In most vector quantization applications, the compression ratio is further increased by applying entropy coding to the codewords before transmission.
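The gain from entropy coding the codewords can be illustrated with a hypothetical stream of indices from a 16-entry codebook. Without entropy coding, each index costs a fixed log2 of the codebook size; when some codewords occur much more often than others, the Shannon entropy of the index stream bounds a much lower average cost (the function names and the index stream below are illustrative, not from the chapter):

```python
from collections import Counter
from math import log2

def fixed_bits(codebook_size):
    """Bits per index with fixed-length codewords (no entropy coding)."""
    return log2(codebook_size)

def entropy_bits(indices):
    """Shannon entropy of the index stream: bits per index an ideal
    entropy coder could approach."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c / n * log2(c / n) for c in counts.values())

# Hypothetical skewed stream of vector quantizer indices
indices = [0, 0, 0, 0, 1, 1, 2, 3]
assert entropy_bits(indices) < fixed_bits(16)   # 1.75 bits vs. 4 bits per index
```

Here the skewed distribution drops the average from 4 bits to 1.75 bits per index, which is the kind of further compression-ratio increase the note refers to.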

Copyright information

© Springer Science+Business Media Dordrecht 1997
