Codesign Case Study on Transport-Triggered Architectures

  • Living reference work entry
Handbook of Hardware/Software Codesign

Abstract

Application-specific processors aim to combine the efficiency of fixed-function application-specific integrated circuits with the flexibility of software running on programmable processors. Efficiency comes from tailoring the processor architecture to the requirements of the application, while programmability provides the flexibility. In this chapter, we introduce a hardware/software codesign environment for developing application-specific processors. The environment uses processor templates based on the transport-triggering paradigm, hence the name transport-triggered architecture (TTA). The fast Fourier transform (FFT) serves as an example application to illustrate the customization. We discuss specific features of FFTs and show how they can be exploited in FFT implementations. We have customized a TTA processor for the FFT and compare its energy efficiency against several other FFT implementations to demonstrate the potential of the concept.

Acknowledgements

The authors thank the Finnish Funding Agency for Innovation in the context of the FiDiPro project StreamPro (decision no. 40142/14).

Author information

Corresponding author

Correspondence to Jarmo Takala.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Takala, J., Jääskeläinen, P., Pitkänen, T. (2016). Codesign Case Study on Transport-Triggered Architectures. In: Ha, S., Teich, J. (eds) Handbook of Hardware/Software Codesign. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7358-4_39-1

  • DOI: https://doi.org/10.1007/978-94-017-7358-4_39-1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-017-7358-4

  • Online ISBN: 978-94-017-7358-4

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
