DSP Systems using Three-Dimensional (3D) Integration Technology

  • Tong Zhang
  • Yangyang Pan
  • Yiran Li
Chapter

Abstract

As three-dimensional (3D) integration technology matures and steadily enters mainstream markets, it is attracting rapidly growing interest from integrated circuit and system designers. This chapter discusses and demonstrates the opportunities that 3D integration technology offers to digital signal processing (DSP) circuit and system designers. In particular, it advocates a 3D logic-DRAM integration paradigm and addresses the use of 3D logic-memory integration in both programmable digital signal processors and application-specific digital signal processing circuits. To further demonstrate the potential, the chapter presents case studies on applying 3D logic-DRAM integration to clustered very long instruction word (VLIW) digital signal processors and application-specific video encoders. Since research on DSP systems using 3D integration technology is still in its infancy, this chapter presents some first discussions and results with the aim of motivating greater future efforts from the DSP system research community to explore this new and rewarding research area.

Keywords

Motion Estimation; Digital Signal Processor; Video Encoder; Digital Signal Processing System; VLIW Processor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Rensselaer Polytechnic Institute, Troy, USA
