Abstract
Dataflow applications are becoming highly variable as dynamic actors play a growing role, so the best match between dataflow graphs and heterogeneous multiprocessor platforms must be found at runtime. The mapping must therefore be adapted dynamically according to the input data and to the communication loads between the computation cores. This is typically the case for mobile devices that run multimedia applications. The problem of mapping a dataflow application, i.e. a network of computational actors, onto a multiprocessor platform can be modeled as a partitioning problem in which the cells are the dataflow actors and the partitions are the processors. While the benefit of executing a computational part on one processor rather than another is usually well demonstrated, the migration overhead is usually not considered. This paper presents a dynamic mapping algorithm that is performed at runtime and is based on evaluating single-move migrations, jointly considering the cost and benefit of each possible migration. The method is first applied to a set of randomly generated benchmarks with different features and different scenarios. It is then applied to an MPEG-4 Simple Profile video decoder with different input sequences. The results systematically show that the runtime mapping significantly improves on the initial mapping, and that it is fast enough to be executed at runtime in order to track the best mapping as the data vary. A further observation is that ignoring the migration cost of the new mapping can lead to worse performance than the original mapping.
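To make the single-move idea concrete, here is a minimal Python sketch of how a move could be scored when both its benefit and its migration cost are taken into account. It is not the algorithm of the paper: the additive cost model (execution cost plus a penalty for every edge that crosses processors), the function names `total_cost` and `best_single_move`, and all numeric values are assumptions chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Mapping = Dict[str, int]  # actor name -> processor index


def total_cost(mapping: Mapping,
               exec_cost: Dict[str, List[float]],
               comm_cost: Dict[Tuple[str, str], float]) -> float:
    """Assumed cost model: execution cost of each actor on its processor,
    plus the communication cost of every edge whose ends are on different
    processors."""
    compute = sum(exec_cost[a][p] for a, p in mapping.items())
    comm = sum(c for (a, b), c in comm_cost.items() if mapping[a] != mapping[b])
    return compute + comm


def best_single_move(mapping: Mapping,
                     exec_cost: Dict[str, List[float]],
                     comm_cost: Dict[Tuple[str, str], float],
                     migration_cost: Dict[str, float],
                     n_procs: int) -> Tuple[Optional[Tuple[str, int]], float]:
    """Return the (actor, target processor) move with the largest positive
    net gain, or (None, 0.0) if no move pays for its own migration."""
    base = total_cost(mapping, exec_cost, comm_cost)
    best, best_gain = None, 0.0
    for actor, current in mapping.items():
        for target in range(n_procs):
            if target == current:
                continue
            trial = dict(mapping)
            trial[actor] = target
            # Net gain = benefit of the new mapping minus the one-off
            # cost of migrating the actor (hypothetical units).
            gain = base - total_cost(trial, exec_cost, comm_cost) - migration_cost[actor]
            if gain > best_gain:
                best, best_gain = (actor, target), gain
    return best, best_gain


# Hypothetical three-actor pipeline A -> B -> C on two processors.
exec_cost = {"A": [4, 2], "B": [3, 3], "C": [5, 1]}
comm_cost = {("A", "B"): 2, ("B", "C"): 1}
mapping = {"A": 0, "B": 0, "C": 0}
migration_cost = {"A": 1, "B": 1, "C": 5}
no_migration_cost = {a: 0 for a in mapping}

move_ignoring_migration = best_single_move(mapping, exec_cost, comm_cost, no_migration_cost, 2)
move_with_migration = best_single_move(mapping, exec_cost, comm_cost, migration_cost, 2)
```

With these made-up numbers, ignoring migration cost selects moving actor C to processor 1, whereas accounting for C's migration cost rejects every move. This mirrors, in a toy setting, the abstract's remark that a remapping chosen without its migration cost may not pay off.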
Notes
The first mapping is obtained by the RMI algorithm.
Acknowledgments
This work was supported by the French ANR project COMPA 2011-INSE-01203 (http://compa-project.org) in collaboration with IETR, IRISA, and CAPS Entreprise. We would like to give special thanks to the Orcc and S-LAM communities for actively participating in the development of the tools that provide a solid foundation for this work.
Cite this article
Ngo, T.D., Martin, K.J.M. & Diguet, JP. Move Based Algorithm for Runtime Mapping of Dataflow Actors on Heterogeneous MPSoCs. J Sign Process Syst 87, 63–80 (2017). https://doi.org/10.1007/s11265-015-1088-z
DOI: https://doi.org/10.1007/s11265-015-1088-z