Incorporating prior knowledge in support vector regression

Abstract

This paper explores the incorporation of prior knowledge in support vector regression by the addition of constraints. Equality and inequality constraints are studied, together with the corresponding types of prior knowledge that the method can handle. These include particular points with known values; prior knowledge on any derivative of the function, either provided by a prior model or available only at some specific points; and bounds on the function or any of its derivatives in a given domain. Moreover, a new method is proposed for the simultaneous approximation of multiple outputs linked by some prior knowledge. This method also allows different types of prior knowledge on single outputs to be considered while training on multiple outputs. Synthetic examples show that incorporating a wide variety of prior knowledge becomes easy, as it leads to linear programs, and helps to improve the approximation in difficult cases. Finally, the benefits of the method are shown on a real-life application, the estimation of in-cylinder residual gas fraction in spark ignition engines, which is representative of numerous situations met in engineering.
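As an illustration of how a known point value can enter a linear-programming formulation of support vector regression, the following is a minimal sketch and not the authors' exact formulation. It assumes a one-dimensional input, a Gaussian kernel, an ℓ1 penalty on the expansion coefficients, and a hard equality constraint f(x_prior) = y_prior. The function names `rbf_kernel` and `fit_lp_svr` and the default values of `C`, `eps`, and `sigma` are hypothetical choices made for this example.

```python
import numpy as np
from scipy.optimize import linprog


def rbf_kernel(a, b, sigma=0.5):
    """Gaussian kernel matrix between two 1-D arrays of inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))


def fit_lp_svr(x, y, x_prior=None, y_prior=None, C=10.0, eps=0.05, sigma=0.5):
    """Fit f(t) = sum_i alpha_i k(t, x_i) + b by linear programming.

    Variables are ordered [alpha_plus (n), alpha_minus (n), b (1), xi (n)],
    with alpha = alpha_plus - alpha_minus. The objective is
    ||alpha||_1 + C * sum(xi), and xi measures deviations beyond eps.
    """
    n = len(x)
    K = rbf_kernel(x, x, sigma)
    ones = np.ones((n, 1))
    eye = np.eye(n)

    cost = np.concatenate([np.ones(2 * n), [0.0], C * np.ones(n)])

    # eps-insensitive constraints: |f(x_i) - y_i| <= eps + xi_i
    A_ub = np.block([[K, -K, ones, -eye],
                     [-K, K, -ones, -eye]])
    b_ub = np.concatenate([y + eps, -y + eps])

    # Prior knowledge (assumed hard here): f(x_prior) = y_prior
    A_eq = b_eq = None
    if x_prior is not None:
        xp = np.atleast_1d(np.asarray(x_prior, dtype=float))
        Kp = rbf_kernel(xp, x, sigma)
        m = Kp.shape[0]
        A_eq = np.hstack([Kp, -Kp, np.ones((m, 1)), np.zeros((m, n))])
        b_eq = np.atleast_1d(np.asarray(y_prior, dtype=float))

    # alpha_plus, alpha_minus, xi are nonnegative; the bias b is free
    bounds = [(0, None)] * (2 * n) + [(None, None)] + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    alpha = res.x[:n] - res.x[n:2 * n]
    bias = res.x[2 * n]

    def predict(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return rbf_kernel(t, x, sigma) @ alpha + bias

    return predict


# Usage: noisy samples of sin(2*pi*x) on [0.2, 1], plus the known value f(0) = 0
rng = np.random.default_rng(0)
x_train = np.linspace(0.2, 1.0, 15)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(15)
f = fit_lp_svr(x_train, y_train, x_prior=0.0, y_prior=0.0)
```

Because the slack variables and the bias are unbounded above, the equality constraint is always feasible in this sketch. A softer variant would give the prior-knowledge constraint its own slack variable, penalized in the objective.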

Author information

Corresponding author

Correspondence to Fabien Lauer.

Additional information

Editor: Dale Schuurmans.

About this article

Cite this article

Lauer, F., Bloch, G. Incorporating prior knowledge in support vector regression. Mach Learn 70, 89–118 (2008). https://doi.org/10.1007/s10994-007-5035-5

Keywords

  • Support Vector Regression
  • Kernel approximation
  • Prior knowledge
  • Multi-outputs