The Journal of Supercomputing, Volume 74, Issue 6, pp 2841–2869

Parallel synchronous and asynchronous coupled simulated annealing

  • Kayo Gonçalves-e-Silva
  • Daniel Aloise
  • Samuel Xavier-de-Souza


We propose parallel synchronous and asynchronous implementations of the coupled simulated annealing (CSA) algorithm for shared-memory architectures. The original CSA was implemented synchronously on a distributed-memory architecture. It synchronizes at each temperature update, which causes idling and a loss of efficiency as the number of processors increases. The proposed synchronous CSA (SCSA) follows the original implementation, but on a shared-memory architecture. The proposed asynchronous CSA (ACSA) does not synchronize, which allows higher parallel efficiency for larger numbers of processors. Results from extensive experiments show that the ACSA achieves much better solution quality than both the serial version and the SCSA. The experiments also show that the ACSA outperforms the SCSA for less computationally intensive problems or when a larger number of processing cores is available. Moreover, the parallel efficiency of the ACSA improves as the problem size increases. With the advent of the multi-core era, the proposed algorithm becomes more attractive than the original synchronous CSA.
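The difference between the two schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sphere objective, the geometric cooling factor of 0.99, and the coupled acceptance rule are all assumptions chosen for illustration, and Python threads only mimic the control flow of a shared-memory program without giving a real speedup. In the synchronous mode every worker waits at a barrier before each temperature update; in the asynchronous mode workers proceed as soon as they finish a step and read whatever energies the others last wrote.

```python
import math
import random
import threading


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum 0."""
    return sum(v * v for v in x)


def run_coupled_annealing(num_workers=4, steps=200, dim=2,
                          synchronous=True, seed=0):
    """Return the best energy found by a toy coupled annealing run."""
    rngs = [random.Random(seed + i) for i in range(num_workers)]
    solutions = [[rngs[i].uniform(-5.0, 5.0) for _ in range(dim)]
                 for i in range(num_workers)]
    # Shared memory: every worker reads all energies to compute the coupling.
    energies = [sphere(s) for s in solutions]
    best = list(energies)
    lock = threading.Lock()
    barrier = threading.Barrier(num_workers)

    def worker(i):
        rng = rngs[i]
        t_gen, t_acc = 1.0, 1.0  # generation and acceptance temperatures
        for _ in range(steps):
            candidate = [v + rng.gauss(0.0, t_gen) for v in solutions[i]]
            e_new = sphere(candidate)
            with lock:
                # Assumed coupling rule: the acceptance probability depends
                # on the energies of all current solutions.
                e_max = max(energies)
                gamma = sum(math.exp((e - e_max) / t_acc) for e in energies)
                accept_prob = math.exp((energies[i] - e_max) / t_acc) / gamma
                if e_new < energies[i] or rng.random() < accept_prob:
                    solutions[i] = candidate
                    energies[i] = e_new
                    best[i] = min(best[i], e_new)
            if synchronous:
                # Synchronous mode: wait for all workers before cooling.
                barrier.wait()
            # Assumed geometric cooling schedule.
            t_gen *= 0.99
            t_acc *= 0.99

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return min(best)


if __name__ == "__main__":
    print("synchronous best energy: ", run_coupled_annealing(synchronous=True))
    print("asynchronous best energy:", run_coupled_annealing(synchronous=False))
```

The only structural difference between the two modes in this sketch is the `barrier.wait()` call. That barrier is where faster workers would idle in a synchronous scheme, which is the source of the efficiency loss described above.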


Coupled simulated annealing; Global optimization; Parallel algorithms; Parallel efficiency



This research was supported by NPAD/UFRN.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Kayo Gonçalves-e-Silva (1)
  • Daniel Aloise (2)
  • Samuel Xavier-de-Souza (1)

  1. Universidade Federal do Rio Grande do Norte, Natal, Brazil
  2. École Polytechnique de Montréal, Montréal, Canada
