SVM Approximation of Value Function Contours in Target Hitting Problems
In a target hitting problem, the capture basin at cost c is the set of states that can reach the target with a cost less than or equal to c, without breaking the viability constraints. The boundary of the c-capture basin is the c-contour of the value function of the problem. In this paper, we propose a new algorithm that solves target hitting problems by iteratively approximating capture basins at successive costs. We show that, by a simple change of variables, minimising a cost can be reduced to a time minimisation problem, and hence a recursive backward procedure can be set up. Two variants of the algorithm are derived: one provides an inner approximation (the approximation is included in the actual capture basin) and the other provides an outer approximation, which allows one to assess the approximation error. We use a machine learning algorithm (as a particular case, we consider Support Vector Machines) trained on the points of a grid with Boolean labels, and we state the conditions on the machine learning procedure that guarantee the convergence of the approximations towards the actual capture basin when the resolution of the grid tends to 0. Moreover, we define a control procedure which uses the set of capture basin approximations to drive a point into the target. When the inner approximation is used, the procedure is guaranteed to hit the target, and when the resolution of the grid tends to 0, the controller tends to the optimal one (minimising the cost to hit the target). We illustrate the method on two simple examples, the Zermelo problem and the car-on-the-hill problem.
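The backward recursion and the control procedure described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a hypothetical one-dimensional system x(k+1) = x(k) + u·dt with two controls, takes elapsed time as the cost, and replaces the SVM with a plain nearest-grid-point lookup as a stand-in classifier. All function names and parameters are illustrative assumptions.

```python
def make_grid(lo, hi, h):
    """Regular grid on the constraint set K = [lo, hi] with resolution h."""
    n = int(round((hi - lo) / h))
    return [lo + i * h for i in range(n + 1)]


def classify(labels, lo, hi, h, x):
    """Stand-in for the trained classifier (NOT an SVM): label of the
    nearest grid point. A point outside K violates the viability
    constraint, so it is never counted as part of a capture basin."""
    if x < lo or x > hi:
        return False
    return labels[int(round((x - lo) / h))]


def capture_basins(lo, hi, h, dt, controls, in_target, n_steps):
    """Boolean grid labels of the capture basins at times 0, dt, ..., n_steps*dt.

    Assumed toy dynamics: x_next = x + u * dt for u in controls.
    """
    grid = make_grid(lo, hi, h)
    basins = [[in_target(x) for x in grid]]  # basin at cost 0 is the target
    for _ in range(n_steps):
        prev = basins[-1]
        # A grid point joins the next basin if it was already captured or
        # if some control sends it into the previous basin approximation.
        new = [
            was_in or any(classify(prev, lo, hi, h, x + u * dt) for u in controls)
            for x, was_in in zip(grid, prev)
        ]
        basins.append(new)
    return grid, basins


def drive_to_target(x, lo, hi, h, dt, controls, basins):
    """Follow the nested basins downwards until the target (basin 0) is hit."""
    path = [x]
    while not classify(basins[0], lo, hi, h, x):
        # Smallest cost level whose basin approximation contains x.
        level = next(
            (n for n, b in enumerate(basins) if classify(b, lo, hi, h, x)), None
        )
        if level is None:
            raise ValueError("state is outside every computed capture basin")
        # Any control that lands in the basin one level below is acceptable.
        u = next(
            (u for u in controls
             if classify(basins[level - 1], lo, hi, h, x + u * dt)),
            None,
        )
        if u is None:
            raise ValueError("no control reaches the lower-cost basin")
        x = x + u * dt
        path.append(x)
    return path
```

A possible call, with illustrative values only, is `capture_basins(-2.0, 2.0, 0.25, 0.25, (-1.0, 1.0), lambda x: abs(x) <= 0.25, 3)` followed by `drive_to_target(1.0, -2.0, 2.0, 0.25, 0.25, (-1.0, 1.0), basins)`. In an SVM-based variant, `classify` would be replaced by the decision function of a classifier trained on the labelled grid at each iteration.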
Keywords: Viability theory, Capture basin, Optimal control, Support vector machines