Journal of Signal Processing Systems, Volume 83, Issue 3, pp 309–328

Scheduling of Parallelized Synchronous Dataflow Actors for Multicore Signal Processing

  • Zheng Zhou
  • William Plishker
  • Shuvra S. Bhattacharyya
  • Karol Desnos
  • Maxime Pelcat
  • Jean-Francois Nezan


Parallelization of Digital Signal Processing (DSP) software is an important trend in Multiprocessor System-on-Chip (MPSoC) implementation. The performance of DSP systems composed of parallelized computations depends on the scheduling technique, which must in general allocate computation and communication resources for competing tasks and ensure that data dependencies are satisfied. In this paper, we formulate a new type of parallel task scheduling problem called Parallel Actor Scheduling (PAS) for MPSoC mapping of DSP systems that are represented as Synchronous Dataflow (SDF) graphs. In contrast to traditional SDF-based scheduling techniques, which focus on exploiting graph-level (inter-actor) parallelism, the PAS problem targets the integrated exploitation of both intra- and inter-actor parallelism for platforms in which individual actors can be parallelized across multiple processing units. We first address a special case of the PAS problem in which all of the actors in the DSP application or subsystem being optimized are parallel actors (i.e., they can be parallelized to exploit multiple cores). For this special case, we develop and experimentally evaluate a two-phase scheduling framework with three workflows that involve particle swarm optimization (PSO): PSO with a mixed integer programming formulation, PSO with simulated annealing, and PSO with a fast heuristic based on list scheduling. Then, we extend our scheduling framework to support the general PAS problem, which considers both parallel actors and sequential actors (actors that cannot be parallelized) in an integrated manner. We demonstrate that our PAS-targeted scheduling framework provides a useful range of trade-offs between synthesis time requirements and the quality of the derived solutions. We also demonstrate the performance of our scheduling framework in two ways: simulations on a diverse set of randomly generated SDF graphs, and implementations of an image processing application and a software defined radio benchmark on a state-of-the-art multicore DSP platform.
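To make the flavor of the problem concrete, the following is a minimal sketch of generic greedy list scheduling for tasks that each occupy several cores at once. It is not the framework described in the abstract: the function name `list_schedule`, the fixed per-actor core counts, the integer execution times, and the longest-first priority rule are all assumptions chosen for illustration, and the PSO phase, communication costs, and SDF repetition counts are left out.

```python
import heapq


def list_schedule(actors, deps, num_cores):
    """Greedy list scheduling of multi-core tasks (illustrative sketch only).

    actors:    {name: (cores, time)}, where cores is the number of cores the
               actor occupies while it runs (hypothetical, fixed in advance).
    deps:      iterable of (pred, succ) precedence pairs.
    num_cores: number of identical cores on the platform.
    Returns {name: (start, finish)}.
    """
    preds_left = {a: 0 for a in actors}
    succs = {a: [] for a in actors}
    for pred, succ in deps:
        preds_left[succ] += 1
        succs[pred].append(succ)
    for name, (cores, _) in actors.items():
        if cores > num_cores:
            raise ValueError(f"{name} needs more cores than the platform has")

    ready = [a for a in actors if preds_left[a] == 0]
    running = []  # min-heap of (finish_time, name)
    free, now, schedule = num_cores, 0, {}

    while ready or running:
        # Assumed priority rule: longest execution time first, ties by name.
        ready.sort(key=lambda a: (-actors[a][1], a))
        waiting = []
        for a in ready:
            cores, time = actors[a]
            if cores <= free:  # start the actor if enough cores are idle
                free -= cores
                schedule[a] = (now, now + time)
                heapq.heappush(running, (now + time, a))
            else:
                waiting.append(a)
        ready = waiting

        # Advance to the next completion time and release those cores.
        now, done = heapq.heappop(running)
        finished = [done]
        while running and running[0][0] == now:
            finished.append(heapq.heappop(running)[1])
        for f in finished:
            free += actors[f][0]
            for s in succs[f]:
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.append(s)

    if len(schedule) != len(actors):
        raise ValueError("dependency graph contains a cycle")
    return schedule


if __name__ == "__main__":
    # Hypothetical four-actor graph on a four-core platform.
    example = {"A": (2, 3), "B": (2, 2), "C": (4, 1), "D": (1, 2)}
    result = list_schedule(example, [("A", "C"), ("B", "C")], num_cores=4)
    for name, (start, finish) in sorted(result.items()):
        print(name, start, finish)
```

In this sketch the number of cores per actor is an input. Choosing those counts is the part of the problem that a search method such as PSO would explore, with a scheduler of this kind evaluating each candidate assignment.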


Keywords: Multicore processors, Digital signal processors, Synchronous dataflow, Dataflow modeling, Software synthesis



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Zheng Zhou (1)
  • William Plishker (2)
  • Shuvra S. Bhattacharyya (2)
  • Karol Desnos (3)
  • Maxime Pelcat (3)
  • Jean-Francois Nezan (3)

  1. Texas Instruments, Germantown, USA
  2. Department of ECE and UMIACS, University of Maryland, College Park, USA
  3. IETR, INSA Rennes, CNRS UMR 6164, UEB, Rennes, France
