General solutions for information-based and Bayesian approaches to model selection in linear regression and their equivalence
In this paper we consider the problem of model selection for linear regression within the Bayesian and information-based frameworks. For both cases we generalize the known approaches (evidence-based and the Akaike information criterion) and derive criterion functions in terms of Gaussian weight priors that are, in the general case, non-factorial. Optimizing these criterion functions leads to two semidefinite optimization problems that can be solved analytically. We present a method that finds the best priors under both approaches and show that the two approaches are equivalent. Surprisingly, the optimal prior turns out to have a rank-one covariance matrix. We derive an explicit condition for a degenerate decision rule, i.e., a regression with all weights equal to zero. We conclude with experiments showing that the proposed approach significantly reduces the time needed for model selection compared with alternatives based on cross-validation and iterative evidence maximization, while preserving generalization ability.
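For orientation, the evidence criterion mentioned above can be written out in its standard textbook form for a zero-mean Gaussian weight prior with a full covariance matrix. This is a sketch under assumed notation, not the paper's own derivation: the symbols X (n-by-d design matrix), t (targets), w (weights), A (prior covariance), sigma^2 (noise variance), and u = Xa are introduced here for illustration only.

```latex
% Assumed model: t = Xw + \varepsilon, with Gaussian noise and a Gaussian prior on w.
\begin{align}
  \varepsilon &\sim \mathcal{N}(0, \sigma^2 I_n), \qquad
  w \sim \mathcal{N}(0, A), \qquad A \succeq 0, \\
  % Marginalizing over w gives a zero-mean Gaussian over the targets.
  p(t \mid X, A, \sigma^2) &= \mathcal{N}(t \mid 0, C), \qquad
  C = \sigma^2 I_n + X A X^{\top}, \\
  \log p(t \mid X, A, \sigma^2) &=
  -\tfrac{1}{2}\left[\, n \log 2\pi + \log\det C + t^{\top} C^{-1} t \,\right].
\end{align}
% Special case of a rank-one prior covariance A = a a^{\top}, with u = X a:
\begin{align}
  C &= \sigma^2 I_n + u u^{\top}, \\
  \det C &= \sigma^{2(n-1)} \left( \sigma^2 + \lVert u \rVert^2 \right), \\
  C^{-1} &= \frac{1}{\sigma^2}
  \left( I_n - \frac{u u^{\top}}{\sigma^2 + \lVert u \rVert^2} \right).
\end{align}
```

The rank-one case follows from the matrix determinant lemma and the Sherman-Morrison formula. It is included only because the abstract reports that the optimal prior has a rank-one covariance matrix. The constraint that A be positive semidefinite is what makes maximizing over A a semidefinite problem.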
Keywords: Akaike Information Criterion, Relevance Vector Machine, Bayesian Model Selection, Regularization Matrix, Regularization Coefficient