# Parallel synchronous and asynchronous coupled simulated annealing

## Abstract

We propose parallel synchronous and asynchronous implementations of the coupled simulated annealing (CSA) algorithm for shared-memory architectures. The original CSA was implemented synchronously on a distributed-memory architecture. It synchronizes at each temperature update, which causes processors to idle and reduces efficiency as the number of processors grows. The proposed synchronous CSA (SCSA) follows the original implementation, but targets a shared-memory architecture. The proposed asynchronous CSA (ACSA) does not synchronize, which allows higher parallel efficiency for larger numbers of processors. Results from extensive experiments show that the ACSA yields much better solution quality than both the serial version and the SCSA. The experiments also show that the ACSA outperforms the SCSA for less computationally intensive problems or when a larger number of processing cores is available. Moreover, the parallel efficiency of the ACSA improves as the problem size increases. With the advent of the multi-core era, the proposed algorithm becomes more attractive than the original synchronous CSA.
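The idling caused by synchronizing at every temperature update can be illustrated with a toy timing model. The sketch below is not the authors' implementation and does not run CSA at all: the function names and the `step_times` values are hypothetical, and the model assumes each worker's time per temperature step is known in advance. It only compares the wall time of a barrier after every step with the wall time when workers never wait, using the usual definition of parallel efficiency (serial time divided by the number of workers times the parallel time).

```python
def sync_makespan(step_times):
    """Wall time when every worker waits at a barrier after each step.

    step_times[w][k] is the (assumed) time worker w needs for step k.
    Each step costs as much as its slowest worker.
    """
    n_steps = len(step_times[0])
    return sum(max(worker[k] for worker in step_times) for k in range(n_steps))


def async_makespan(step_times):
    """Wall time when workers never wait for each other.

    The run ends when the worker with the largest total time finishes.
    """
    return max(sum(worker) for worker in step_times)


def parallel_efficiency(serial_time, parallel_time, n_workers):
    """Speedup divided by the number of workers."""
    return serial_time / (n_workers * parallel_time)


# Hypothetical per-step times for two workers and three temperature steps.
step_times = [
    [1.0, 3.0, 1.0],
    [3.0, 1.0, 1.0],
]

serial_time = sum(sum(worker) for worker in step_times)
sync_time = sync_makespan(step_times)
async_time = async_makespan(step_times)

sync_eff = parallel_efficiency(serial_time, sync_time, len(step_times))
async_eff = parallel_efficiency(serial_time, async_time, len(step_times))

print(f"synchronous:  time={sync_time}, efficiency={sync_eff:.2f}")
print(f"asynchronous: time={async_time}, efficiency={async_eff:.2f}")
```

In this model the two workers do the same total work, yet the barrier version is slower because the slow step falls on a different worker each time. The model covers timing only; it says nothing about how removing synchronization affects solution quality, which the abstract reports as an experimental result.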

## Keywords

Coupled simulated annealing, Global optimization, Parallel algorithms, Parallel efficiency

## Notes

### Acknowledgements

This research was supported by NPAD/UFRN.
