Constraints, Volume 23, Issue 1, pp 1–43

Accelerating exact and approximate inference for (distributed) discrete optimization with GPUs

  • Ferdinando Fioretto
  • Enrico Pontelli
  • William Yeoh
  • Rina Dechter

Abstract

Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems, including Weighted Constraint Satisfaction Problems (WCSPs) and Distributed Constraint Optimization Problems (DCOPs), as well as in stochastic variants such as the task of finding the most probable explanation (MPE) in belief networks. Inference-based algorithms are powerful techniques for solving discrete optimization problems, and they can be used independently or in combination with other techniques. However, their applicability is often limited by their compute-intensive nature and their space requirements. This paper proposes the design and implementation of a novel inference-based technique that exploits modern massively parallel architectures, such as those found in Graphics Processing Units (GPUs), to speed up exact and approximate inference-based algorithms for discrete optimization. The paper studies the proposed algorithm in both centralized and distributed optimization contexts. It demonstrates that the use of GPUs provides significant advantages in terms of runtime and scalability, achieving speedups of up to two orders of magnitude (up to 345 times faster) with respect to a sequential version.
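
To make the kind of inference step discussed in the abstract concrete, the following is a minimal sequential sketch of a single min-sum variable elimination, the basic operation of bucket elimination. It is not the authors' GPU implementation: the function name `eliminate`, the dictionary-based table layout, and the cost values are assumptions chosen for illustration only. The sketch combines two cost functions f(x, y) and g(y, z) and projects out y, producing h(x, z) = min over y of f(x, y) + g(y, z).

```python
from itertools import product


def eliminate(f, g, dom_x, dom_y, dom_z):
    """Return the table h with h[(x, z)] = min over y of f[(x, y)] + g[(y, z)]."""
    h = {}
    for x, z in product(dom_x, dom_z):
        # Each (x, z) entry depends only on its own row of f and column of g,
        # so the entries could in principle be computed independently
        # (for example, one parallel thread per entry).
        h[(x, z)] = min(f[(x, y)] + g[(y, z)] for y in dom_y)
    return h


# Hypothetical binary domains and cost tables (illustrative values only).
dom = (0, 1)
f = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 5}
g = {(0, 0): 4, (0, 1): 0, (1, 0): 1, (1, 1): 6}

h = eliminate(f, g, dom, dom, dom)
best_cost = min(h.values())  # optimal aggregated cost of f + g
print(h, best_cost)
```

The output table has one entry per combination of the remaining variables, and no entry depends on any other. This independence is what makes such a step a natural candidate for massively parallel hardware; the table's size, which grows exponentially with the number of remaining variables, is the space requirement the abstract refers to.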

Keywords

GPU, WCSP, MPE, DCOP, (Mini-)bucket elimination, (A)DPOP

Notes

Acknowledgements

We thank the anonymous reviewers for their comments. This research is partially supported by the National Science Foundation under grants 1345232, 1401639, 1458595, 1526842, and 1550662. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsoring organizations, agencies, or the U.S. government.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Industrial and Operations Engineering, University of Michigan, Ann Arbor, USA
  2. Computer Science, New Mexico State University, Las Cruces, USA
  3. Computer Science and Engineering, Washington University in St. Louis, St. Louis, USA
  4. School of Information and Computer Science, University of California, Irvine, USA
