Early Stage Automatic Strategy for Power-Aware Signal Processing Systems Design
Complexity management, portability, and long-term adaptivity are common challenges across embedded systems, and they typically conflict with the need for efficient resource utilization and power balance. Image and signal processing systems must offer a large variety of complex functions while also coping with battery-life limitations. Wearable signal processing systems, for example, should provide high performance and support new-generation standards without compromising their portability and long-term usability. These constraints challenge hardware designers: early-stage trade-off analysis and automated power management techniques help guarantee a reasonable time-to-market. In the field of video codec specifications, the MPEG Reconfigurable Video Coding (RVC) framework addresses functional complexity and adaptivity by leveraging the intrinsic modularity of the dataflow model of computation, but it still lacks power management support. The main contribution of this work is an automatic early-stage power management methodology to be adopted within the MPEG-RVC context. Starting from different high-level specifications, our mapping methodology identifies, directly on the high-level models, disjoint, homogeneous logic clock regions where the platform resources can be enabled or disabled together without affecting overall system performance. To extend its usability to the RVC community, we have integrated this methodology within the Multi-Dataflow Composer (MDC) tool, which deploys on-the-fly reconfigurable signal processing platforms. In this paper, we extend MDC to address power-aware multi-context systems. To prove the effectiveness of our work, a coprocessor for image and video processing acceleration has been assembled. The coprocessor has been synthesized on a 90 nm ASIC technology, where it demonstrated up to a 90% reduction in dynamic power consumption on different dataflow-intensive applications. It has also been implemented on FPGA, partially confirming the benefits of the proposed methodology.
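The region identification described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each high-level specification is given as a set of actor names and that two actors belong to the same region when exactly the same set of dataflows uses them. The function names `logic_regions` and `clock_enables`, the dataflow names, and the actor names are all hypothetical; this is not the algorithm implemented in MDC.

```python
from collections import defaultdict


def logic_regions(dataflows):
    """Group actors by the exact set of dataflows that use them.

    `dataflows` maps a dataflow name to the set of actors it instantiates.
    Actors that share the same set of users are always active together,
    so they can share one clock-enable signal (assumption of this sketch).
    """
    usage = defaultdict(set)
    for name, actors in dataflows.items():
        for actor in actors:
            usage[actor].add(name)

    regions = defaultdict(set)
    for actor, users in usage.items():
        regions[frozenset(users)].add(actor)
    return dict(regions)


def clock_enables(regions, active):
    """For each region, report whether its clock is needed by `active`."""
    return {users: active in users for users in regions}


# Hypothetical input: three dataflows sharing some actors.
dataflows = {
    "A": {"fir", "scale", "clip"},
    "B": {"fir", "dct", "clip"},
    "C": {"dct", "quant"},
}

regions = logic_regions(dataflows)
enables = clock_enables(regions, "A")

for users, actors in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    state = "on" if enables[users] else "gated"
    print(sorted(users), sorted(actors), state)
```

In this toy input, running dataflow `A` keeps the regions used by `A` clocked and gates the two regions that only `B` and `C` need. The regions are disjoint by construction, because each actor is assigned to exactly one set of users.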
Keywords: Signal processing, Dataflow MoC, MPEG-RVC, Coarse-grained reconfigurability, Low-power, Clock gating
Prof. Luigi Raffo and Dr. Carlo Sau are grateful to Sardinia Regional Government for funding the RPCT Project (L.R. 7/2007, CRP-18324) that led to these results. Dr. Sau is also grateful to Sardinia Regional Government for supporting his PhD scholarship (P.O.R. F.S.E., European Social Fund 2007-2013 - Axis IV Human Resources).