Local Feature Selection for the Relevance Vector Machine Using Adaptive Kernel Learning

  • Dimitris Tzikas
  • Aristidis Likas
  • Nikolaos Galatsanos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5768)

Abstract

A Bayesian learning algorithm is presented that is based on a sparse Bayesian linear model, the Relevance Vector Machine (RVM), and learns the parameters of the kernels during model training. The novel characteristic of the method is the introduction of parameters called ‘scaling factors’ that measure the significance of each feature. Using the Bayesian framework, a sparsity-promoting prior is then imposed on the scaling factors in order to eliminate irrelevant features. Feature selection is local because different values are estimated for the scaling factors of each kernel; different features are therefore considered significant in different regions of the input space. We present experimental results on artificial data to demonstrate the advantages of the proposed model, and we then evaluate our method on several commonly used regression and classification datasets.
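The idea of per-kernel scaling factors can be illustrated with a minimal sketch. The abstract does not state the kernel form, so the Gaussian kernel below, the function names `local_kernel` and `predict`, and all numeric values are assumptions made for illustration only. The sketch does not implement the sparsity-promoting prior or the training procedure; it only shows how a scaling factor of zero removes a feature from one kernel while another kernel may still use it.

```python
import math

def local_kernel(x, center, scales):
    """Gaussian kernel with one scaling factor per feature (assumed form).

    phi(x) = exp(-sum_d (scales[d] * (x[d] - center[d]))**2)
    A scaling factor of 0 makes this kernel ignore that feature.
    """
    return math.exp(-sum((s * (xd - cd)) ** 2
                         for s, xd, cd in zip(scales, x, center)))

def predict(x, weights, centers, scales):
    """Linear model over the kernels: y(x) = sum_i w_i * phi_i(x)."""
    return sum(w * local_kernel(x, c, s)
               for w, c, s in zip(weights, centers, scales))

# Two kernels in a 2-D input space (hypothetical values).
centers = [(0.0, 0.0), (5.0, 5.0)]
scales = [(1.0, 0.0),   # kernel 0 ignores feature 1
          (0.0, 1.0)]   # kernel 1 ignores feature 0
weights = [1.0, 1.0]

# Changing feature 1 leaves the response of kernel 0 unchanged.
a = local_kernel((0.5, 0.0), centers[0], scales[0])
b = local_kernel((0.5, 9.0), centers[0], scales[0])
```

Here `a` and `b` are equal because kernel 0 has a zero scaling factor on the second feature, whereas kernel 1 depends only on that feature. This is the sense in which feature relevance can differ from one region of the input space to another.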

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Dimitris Tzikas (1)
  • Aristidis Likas (1)
  • Nikolaos Galatsanos (2)
  1. Department of Computer Science, University of Ioannina, Ioannina, Greece
  2. Department of Electrical Engineering, University of Patras, Rio, Greece
