The hArtes Tool Chain

  • Koen Bertels
  • Ariano Lattanzi
  • Emanuele Ciavattini
  • Ferruccio Bettarelli
  • Maria Teresa Chiaradia
  • Raffaele Nutricato
  • Alberto Morea
  • Anna Antola
  • Fabrizio Ferrandi
  • Marco Lattuada
  • Christian Pilato
  • Donatella Sciuto
  • Roel J. Meeuws
  • Yana Yankova
  • Vlad Mihai Sima
  • Kamana Sigdel
  • Wayne Luk
  • Jose Gabriel de Figueiredo Coutinho
  • Yuet Ming Lam
  • Tim Todman
  • Andrea Michelotti
  • Antonio Cerruto

Abstract

This chapter describes the design steps needed to go from legacy code to a transformed application that can be mapped efficiently onto the hArtes platform.

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Koen Bertels (3)
  • Ariano Lattanzi
  • Emanuele Ciavattini
  • Ferruccio Bettarelli
  • Maria Teresa Chiaradia (1)
  • Raffaele Nutricato (1)
  • Alberto Morea (1)
  • Anna Antola (2)
  • Fabrizio Ferrandi (2)
  • Marco Lattuada (2)
  • Christian Pilato (2)
  • Donatella Sciuto (2)
  • Roel J. Meeuws (3)
  • Yana Yankova (3)
  • Vlad Mihai Sima (3)
  • Kamana Sigdel (3)
  • Wayne Luk (4)
  • Jose Gabriel de Figueiredo Coutinho (4)
  • Yuet Ming Lam (4)
  • Tim Todman (4)
  • Andrea Michelotti (5)
  • Antonio Cerruto (5)

  1. Leaff Engineering, Osimo, Italy
  2. Politecnico di Bari, Bari, Italy
  3. Politecnico di Milano, Milan, Italy
  4. Fac. Electrical Engineering, Mathematics & Computer Science, Delft University of Technology, Delft, The Netherlands
  5. Imperial College, London, UK
