FANG: Fast and Efficient Successor-State Generation for Heuristic Optimization on GPUs

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11944)

Abstract

Many optimization problems (especially nonsmooth ones) are typically solved by genetic, evolutionary, or metaheuristic-based algorithms. However, these approaches and the related literature typically assume the existence of a neighborhood or successor-state function N(x), where x is a candidate state. In combinatorial optimization, the implementation of such a function can become arbitrarily complex. Many N(x) functions have been developed for a wide variety of domain-specific problems. However, porting or realizing these functions on a massively parallel architecture such as a Graphics Processing Unit (GPU) has always been a great challenge. We present a GPU-based method called FANG that implements a generic and reusable N(x) for arbitrary domains in the field of combinatorial optimization. It can be customized to satisfy domain-specific requirements and, by construction, leverages the underlying hardware in a fast and efficient way. Moreover, our method scales well with respect to the number of input states and the complexity of a single state. On our evaluation scenarios, measurements show significant performance improvements over traditional CPU-based exploration approaches.
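To make the notion of a successor-state function concrete, the following is a minimal CPU-side sketch of one possible N(x), not the FANG method itself. It assumes a hypothetical state type `State` that holds one 0/1 decision variable per entry, and a hypothetical function `successors` that returns every state differing from x in exactly one variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical state type (an assumption for illustration):
// one 0/1 decision variable per entry.
using State = std::vector<int>;

// A simple N(x): all states that differ from x in exactly one variable.
std::vector<State> successors(const State& x) {
    std::vector<State> result;
    result.reserve(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(x[i] == 0 || x[i] == 1);  // only binary variables are supported
        State y = x;                     // copy the candidate state
        y[i] = 1 - y[i];                 // flip variable i
        result.push_back(y);
    }
    return result;
}
```

For a state with n variables this neighborhood contains n successors. Domain-specific problems usually need richer moves and constraints, which is where the implementation effort described in the abstract arises.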

This work was funded by the German Federal Ministry for Economic Affairs and Energy (BMWi): Project BloGPV (grant number 01MD18001B).

Notes

  1. We will refer to a single SIMT unit as a warp in the scope of this paper.

  2. The number of possibilities can be seen as a subset of all possible successor states from Fig. 1-2.

  3. Without limiting the parallel execution of multiple groups per multiprocessor.

Acknowledgments

The authors would like to thank Wladimir Panfilenko and Thomas Schmeyer for their suggestions and feedback regarding our method. Furthermore, we would like to thank Gian-Luca Kiefer for additional feedback on the paper. Special thanks to Wladimir at this point for adding the concept of integer-based bit sets for active variables in a single state. This reduces global-memory consumption and improves performance of searching for active variables.
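The idea of an integer-based bit set for active variables can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the type `ActiveSet`, its member names, and the limit of 32 variables per integer are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout (an assumption for illustration):
// bit i is set when variable i of a state is active.
struct ActiveSet {
    std::uint32_t bits = 0;

    void activate(unsigned i)   { assert(i < 32); bits |=  (std::uint32_t{1} << i); }
    void deactivate(unsigned i) { assert(i < 32); bits &= ~(std::uint32_t{1} << i); }
    bool is_active(unsigned i) const { assert(i < 32); return ((bits >> i) & 1u) != 0; }

    // Index of the lowest set bit in word, or -1 if word is zero.
    static int lowest_bit(std::uint32_t word) {
        if (word == 0) return -1;
        int index = 0;
        while ((word & 1u) == 0) { word >>= 1; ++index; }
        return index;
    }

    // Index of the lowest active variable, or -1 if none is active.
    int first_active() const { return lowest_bit(bits); }

    // Index of the next active variable strictly after i, or -1.
    int next_active(unsigned i) const {
        assert(i < 32);
        if (i == 31) return -1;
        const std::uint32_t mask = ~((std::uint32_t{1} << (i + 1)) - 1u);
        return lowest_bit(bits & mask);
    }
};
```

In this sketch, 32 activity flags occupy 4 bytes instead of one flag per variable, and searching for active variables operates on a single integer. On a GPU, the loop in `lowest_bit` would typically be replaced by a hardware bit-scan intrinsic such as CUDA's `__ffs`.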

Author information

Corresponding author

Correspondence to Marcel Köster.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Köster, M., Groß, J., Krüger, A. (2020). FANG: Fast and Efficient Successor-State Generation for Heuristic Optimization on GPUs. In: Wen, S., Zomaya, A., Yang, L. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2019. Lecture Notes in Computer Science, vol 11944. Springer, Cham. https://doi.org/10.1007/978-3-030-38991-8_15
