An efficient GPU-based parallel tabu search algorithm for hardware/software co-design

  • Research Article
  • Frontiers of Computer Science

Abstract

Hardware/software (HW/SW) partitioning is an essential step in hardware/software co-design. For large problems, it is difficult to achieve both high solution quality and short running time. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel for compacting the neighborhood is proposed, which in theory reduces the number of GPU global memory accesses. A kernel fusion strategy is further proposed to reduce the number of global memory accesses made by GPTS. To minimize the transfer overhead between CPU and GPU, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which takes into account the case in which none of the candidates satisfies the given constraint. Experiments show that GPTS outperforms state-of-the-art tabu search approaches and is competitive with other methods for HW/SW partitioning. The gain from the proposed parallelization is significant considering that an ordinary GPU platform is used.
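To make the terms in the abstract concrete, the following is a minimal sequential sketch of tabu search for HW/SW partitioning. It is not the authors' GPTS and involves no GPU code. The cost model is an assumption chosen for illustration: each task has a hardware cost and a software time, communication cost is ignored, and the goal is to minimize total hardware cost while keeping total software time within a bound. All function and parameter names (`evaluate`, `compact_neighborhood`, `tabu_search`, `tenure`) are hypothetical. The step labeled "compaction" keeps only the 1-flip neighbors that satisfy the constraint, which is the role a stream-compaction kernel would play on a GPU; the "tabu evaluation" step also handles the case in which no candidate is admissible.

```python
from typing import List, Tuple


def evaluate(x: List[int], hw_cost: List[int], sw_time: List[int]) -> Tuple[int, int]:
    """Return (hardware cost, software time) of assignment x, where 1 = hardware."""
    cost = sum(h for xi, h in zip(x, hw_cost) if xi == 1)
    time = sum(s for xi, s in zip(x, sw_time) if xi == 0)
    return cost, time


def compact_neighborhood(x, hw_cost, sw_time, time_bound):
    """Evaluate every 1-flip neighbor and keep only the feasible ones.

    Returns a list of (flipped index, resulting hardware cost).
    """
    cost, time = evaluate(x, hw_cost, sw_time)
    kept = []
    for i, xi in enumerate(x):
        if xi == 1:  # move task i from hardware to software
            new_cost, new_time = cost - hw_cost[i], time + sw_time[i]
        else:        # move task i from software to hardware
            new_cost, new_time = cost + hw_cost[i], time - sw_time[i]
        if new_time <= time_bound:  # compaction: drop infeasible neighbors
            kept.append((i, new_cost))
    return kept


def tabu_search(hw_cost, sw_time, time_bound, iterations=50, tenure=2):
    """Assumed model: minimize hardware cost subject to software time <= time_bound."""
    n = len(hw_cost)
    x = [1] * n                    # all-hardware start has software time 0
    best_x, best_cost = x[:], sum(hw_cost)
    tabu_until = [0] * n           # task i is tabu while tabu_until[i] >= iteration
    for it in range(1, iterations + 1):
        candidates = compact_neighborhood(x, hw_cost, sw_time, time_bound)
        # Tabu evaluation: accept non-tabu moves, or tabu moves that beat the
        # best cost found so far (aspiration criterion).
        allowed = [(c, i) for i, c in candidates
                   if tabu_until[i] < it or c < best_cost]
        if not allowed:
            continue               # no admissible candidate in this iteration
        c, i = min(allowed)        # cheapest admissible neighbor
        x[i] ^= 1                  # apply the flip
        tabu_until[i] = it + tenure
        if c < best_cost:
            best_x, best_cost = x[:], c
    return best_x, best_cost
```

A call such as `tabu_search([5, 3, 8, 2], [4, 2, 6, 1], 7)` returns the best assignment found and its hardware cost. In a GPU version, the loop in `compact_neighborhood` is the natural candidate for one thread per neighbor, with only the surviving candidates passed on to tabu evaluation.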

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61472289) and the National Key Research and Development Project (2016YFC0106305). We would also like to thank the anonymous reviewers for their valuable and constructive comments.

Author information

Corresponding author

Correspondence to Fazhi He.

Additional information

Neng Hou received his PhD degree from Wuhan University, China, in 2018. He is currently a lecturer at the School of Computer Science, Yangtze University, China. His research interests include HW/SW co-design and GPU computing.

Fazhi He is currently a professor at the School of Computer Science, Wuhan University, China. His research interests include computer-aided design, computer graphics, image processing, and intelligent computing.

Yi Zhou is currently a lecturer at the School of Information Science and Engineering, Wuhan University of Science and Technology, China. His research interests include metaheuristics based on multi-core CPUs and many-core GPUs.

Yilin Chen is currently a PhD candidate at the School of Computer Science in Wuhan University, China. His research interests include GPGPU in computer graphics.

About this article

Cite this article

Hou, N., He, F., Zhou, Y. et al. An efficient GPU-based parallel tabu search algorithm for hardware/software co-design. Front. Comput. Sci. 14, 145316 (2020). https://doi.org/10.1007/s11704-019-8184-3

  • DOI: https://doi.org/10.1007/s11704-019-8184-3
