Abstract
As machine learning has developed across many fields, it has also been applied to combinatorial optimization problems, where it can automatically discover generic and fast heuristic algorithms from training data while requiring less theoretical and empirical knowledge. The pointer network (Ptr-Net) modifies the attention mechanism: instead of using attention to weight the encoder's hidden states into a context vector, it uses attention as a pointer that selects one element of the input sequence at each decoding step. This solves the problem of an output dictionary whose size varies with the input. Applied to three combinatorial optimization problems, convex hull, Delaunay triangulation, and the traveling salesman problem (TSP), Ptr-Net obtains good approximate solutions. Point matching is a special kind of combinatorial optimization problem whose goal is to obtain the optimal correspondences between two point sets, and it can be cast in the Ptr-Net framework. However, Ptr-Net cannot be applied to point matching as is, because it does not take full advantage of the correspondences between the two point sets. We propose the multi-pointer network, which draws on the idea of multi-label classification, to address this limitation by pointing to a set of input elements. These applications are all based on supervised learning that approximates known target solutions. However, high-quality labeled data is often expensive, unreliable, or simply unavailable, and may be infeasible to obtain for new problem statements, which makes supervised learning impractical. Reinforcement learning, another active research area in machine learning, does not require labeled samples. It interacts with the environment through trial and error and focuses on learning problem-solving strategies. We introduce a framework that tackles combinatorial optimization problems using neural networks and reinforcement learning, focusing on the traveling salesman problem. We also introduce a framework that combines reinforcement learning with a graph embedding network to solve graph optimization problems, focusing on the maximum cut (MAXCUT) and minimum vertex cover (MVC) problems.
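The pointing step described in the abstract can be illustrated with a minimal sketch. It assumes the additive scoring form commonly used for pointer networks, u_j = v · tanh(W1 e_j + W2 d), followed by a softmax over input positions. The names `pointer_step`, `W1`, `W2`, `v`, and the optional `visited` mask are illustrative choices, not taken from the chapter, and the weights below are toy values rather than trained parameters.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax; -inf scores get probability 0."""
    m = max(scores)
    if m == float("-inf"):
        raise ValueError("every input position is masked")
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_step(encoder_states, decoder_state, W1, W2, v, visited=()):
    """Return a probability distribution over input positions.

    Attention is not used to blend encoder states into a context
    vector; the attention weights themselves are the output.
    `visited` is an assumed optional mask (useful for tours, where
    each input may be selected only once).
    """
    proj_dec = matvec(W2, decoder_state)
    scores = []
    for j, e in enumerate(encoder_states):
        if j in visited:
            scores.append(float("-inf"))  # masked positions cannot be chosen
            continue
        proj_enc = matvec(W1, e)
        hidden = [math.tanh(a + b) for a, b in zip(proj_enc, proj_dec)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


# Toy example: three input elements with 2-dimensional hidden states.
encoder_states = [[0.1, 0.2], [0.9, -0.3], [0.4, 0.4]]
decoder_state = [0.5, -0.1]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [0.0, 0.5]]
v = [1.0, 1.0]

probs = pointer_step(encoder_states, decoder_state, W1, W2, v, visited={0})
pointed = max(range(len(probs)), key=probs.__getitem__)
print(probs, pointed)
```

Because the distribution is defined over input positions, its length follows the input length automatically, which is how the variable output dictionary size is handled.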
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Guo, T., Han, C., Tang, S., Ding, M. (2019). Solving Combinatorial Problems with Machine Learning Methods. In: Du, DZ., Pardalos, P., Zhang, Z. (eds) Nonlinear Combinatorial Optimization. Springer Optimization and Its Applications, vol 147. Springer, Cham. https://doi.org/10.1007/978-3-030-16194-1_9
DOI: https://doi.org/10.1007/978-3-030-16194-1_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16193-4
Online ISBN: 978-3-030-16194-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)