Sparse Bayesian Regression for Grouped Variables in Generalized Linear Models
A fully Bayesian framework for sparse regression in generalized linear models is introduced. Assuming that a natural group structure exists on the domain of predictor variables, sparsity conditions are applied to these variable groups so that the observations can be explained by simple and interpretable models. We introduce a general family of distributions which imposes a flexible amount of sparsity on variable groups. This model overcomes the problems associated with insufficient sparsity of traditional selection methods in high-dimensional spaces. The fully Bayesian inference mechanism allows us to quantify the uncertainty in the regression coefficient estimates. The general nature of the framework makes it applicable to a wide variety of generalized linear models with minimal modifications. An efficient MCMC algorithm is presented to sample from the posterior. Simulated experiments validate the strength of this new class of sparse regression models. When applied to the problem of splice site prediction on DNA sequence data, the method identifies key interaction terms of sequence positions which help in identifying “true” splice sites.
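To make the idea of group-level sparsity concrete, the following is a minimal sketch of an unnormalized log-posterior for a logistic GLM whose coefficients are partitioned into groups. It is an illustration under stated assumptions, not the distribution family or the MCMC sampler from the paper: the group-norm prior, the logit link, and the names `lam`, `p`, and `groups` are all hypothetical choices made for this example.

```python
import math


def group_sparse_log_prior(beta, groups, lam=1.0, p=2.0):
    """Unnormalized log-density -lam * sum_g ||beta_g||_p.

    Assumption: a group-norm penalty stands in for the paper's prior family.
    `groups` is a list of index lists, one list per variable group.
    """
    total = 0.0
    for idx in groups:
        # p-norm of the coefficients belonging to this group
        total += sum(abs(beta[j]) ** p for j in idx) ** (1.0 / p)
    return -lam * total


def logistic_log_likelihood(beta, X, y):
    """Bernoulli log-likelihood with a logit link; labels y are 0 or 1."""
    ll = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        # Numerically stable evaluation of log(1 + exp(eta))
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        ll += yi * eta - log1pexp
    return ll


def log_posterior(beta, X, y, groups, lam=1.0, p=2.0):
    """Log-likelihood plus log-prior, up to an additive constant."""
    return (logistic_log_likelihood(beta, X, y)
            + group_sparse_log_prior(beta, groups, lam, p))
```

A function of this shape could serve as the target density for a generic sampler. Because the penalty acts on the norm of each group rather than on individual coefficients, the prior favors configurations in which all coefficients of a group shrink toward zero together.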