This paper explores the incorporation of prior knowledge in support vector regression through the addition of constraints. Equality and inequality constraints are studied together with the types of prior knowledge that each can express. These include particular points with known values; prior knowledge on any derivative of the function, either provided by a prior model or available only at some specific points; and bounds on the function or any of its derivatives in a given domain. Moreover, a new method is proposed for the simultaneous approximation of multiple outputs linked by some prior knowledge. This method also allows different types of prior knowledge on single outputs to be considered while training on multiple outputs. Synthetic examples show that a wide variety of prior knowledge can be incorporated easily, since each case leads to a linear program, and that doing so improves the approximation in difficult cases. The benefits of the method are finally shown on a real-life application, the estimation of in-cylinder residual gas fraction in spark ignition engines, which is representative of numerous situations met in engineering.
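To make concrete why such constraints keep the training problem a linear program, the following is a minimal sketch. It assumes the standard ℓ1-norm, ε-insensitive linear programming form of support vector regression with a kernel expansion; the symbols (α, b, ξ, a, C, ε, k, x_p, y_p, x_q, u) are notation chosen for this illustration and the formulation in the paper itself may differ in its details.

```latex
% Assumed model: kernel expansion over the N training points x_i
f(x) = \sum_{i=1}^{N} \alpha_i \, k(x, x_i) + b

% Assumed training problem: l1-norm LP support vector regression
\begin{aligned}
\min_{\alpha,\, b,\, \xi,\, a} \quad & \sum_{i=1}^{N} a_i + C \sum_{i=1}^{N} \xi_i \\
\text{s.t.} \quad & -\varepsilon - \xi_i \le f(x_i) - y_i \le \varepsilon + \xi_i, \\
& -a_i \le \alpha_i \le a_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, N
\end{aligned}

% Prior knowledge as extra constraints, each linear in (alpha, b):
% (1) a point x_p with known value y_p
\sum_{i=1}^{N} \alpha_i \, k(x_p, x_i) + b = y_p

% (2) a known derivative value y'_p with respect to input component x^j at x_p
\sum_{i=1}^{N} \alpha_i \left. \frac{\partial k(x, x_i)}{\partial x^{j}} \right|_{x = x_p} = y'_p

% (3) an upper bound u on the function, imposed here at chosen points x_q
%     (enforcing the bound only at finitely many points is an assumption of this sketch)
\sum_{i=1}^{N} \alpha_i \, k(x_q, x_i) + b \le u
```

Because f and its derivatives are linear in the unknowns α and b, each piece of prior knowledge adds only linear equalities or inequalities, so the objective and feasible set remain those of a linear program.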
Editor: Dale Schuurmans.
Lauer, F., Bloch, G. Incorporating prior knowledge in support vector regression. Mach Learn 70, 89–118 (2008). https://doi.org/10.1007/s10994-007-5035-5
- Support Vector Regression
- Kernel approximation
- Prior knowledge