A Compact DSP Core with Static Floating-Point Arithmetic
A multimedia system-on-a-chip (SoC) usually contains one or more programmable digital signal processors (DSPs) to accelerate data-intensive computations. Most of these DSP cores were originally designed for standalone applications, so they duplicate some components of the host microprocessor. This paper presents a compact DSP for multi-core systems, which is fully programmable and has been optimized to execute a set of signal processing kernels very efficiently. The DSP core was designed concurrently with its automatic software generator, which is based on high-level synthesis. Moreover, it performs a lightweight arithmetic, static floating-point (SFP), which approximates the quality of floating-point (FP) operations with hardware similar to that of integer arithmetic. In our simulations, the compact DSP and its auto-generated software achieve 3X the performance (estimated in cycles) of the DSP cores in dual-core baseband processors with similar computing resources. The 16-bit SFP achieves a signal-to-round-off-noise ratio above 40 dB relative to IEEE single-precision FP, and it even outperforms hand-optimized programs based on 32-bit integer arithmetic. The 24-bit SFP achieves above 64 dB, and its maximum precision is identical to that of single-precision FP. Finally, the DSP core has been implemented and fabricated in the UMC 0.18 µm 1P6M CMOS technology. It can operate at 314.5 MHz while consuming 52 mW average power. The core size is only 1.5 mm × 1.5 mm, including the 16 KB on-chip memory and the AMBA AHB interface.
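To make the quality metric concrete, the following is a minimal sketch of how a signal-to-round-off-noise ratio could be measured for an integer mantissa whose exponent is fixed ahead of time. It is not the paper's SFP algorithm or its software generator: the function names (`static_exponent`, `quantize_static`, `srnr_db`), the peak-based choice of exponent, and the use of a double-precision reference signal are all assumptions made for illustration.

```python
import math

def static_exponent(peak, mantissa_bits):
    """Pick one exponent (assumed known before run time) so that |peak|
    fits in a signed mantissa of `mantissa_bits` bits. Hypothetical helper."""
    if peak == 0:
        return 0
    largest = (1 << (mantissa_bits - 1)) - 1
    return math.ceil(math.log2(abs(peak) / largest))

def quantize_static(x, mantissa_bits, exponent):
    """Round x to an integer mantissa scaled by 2**exponent.
    The exponent is shared by all values, so only integer hardware
    would be needed for the mantissa (illustrative assumption)."""
    scale = 2.0 ** exponent
    lo = -(1 << (mantissa_bits - 1))
    hi = (1 << (mantissa_bits - 1)) - 1
    mantissa = max(lo, min(hi, round(x / scale)))  # saturate on overflow
    return mantissa * scale

def srnr_db(reference, approx):
    """Signal-to-round-off-noise ratio in dB: signal energy over error energy."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)

def measure(samples, mantissa_bits):
    """Quantize a signal with one static exponent and report the ratio in dB."""
    peak = max(abs(s) for s in samples)
    exponent = static_exponent(peak, mantissa_bits)
    approx = [quantize_static(s, mantissa_bits, exponent) for s in samples]
    return srnr_db(samples, approx)

if __name__ == "__main__":
    tone = [math.sin(0.1 * k) for k in range(1000)]
    print("16-bit:", measure(tone, 16), "dB")
    print("24-bit:", measure(tone, 24), "dB")
```

The sketch quantizes a single signal once, so its figures are not comparable to the 40 dB and 64 dB results above, which concern complete signal processing kernels where round-off errors accumulate across many operations.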
Keywords: Discrete Cosine Transform, Digital Signal Processor, Input Queue, Virtual Address, Integer Arithmetic