Machine Learning and Knowledge Discovery in Databases

Volume 6321 of the series Lecture Notes in Computer Science pp 522-537

Expectation Propagation for Bayesian Multi-task Feature Selection

  • Daniel Hernández-Lobato (Machine Learning Group, ICTEAM institute, Université catholique de Louvain)
  • José Miguel Hernández-Lobato (Computer Science Department, Universidad Autónoma de Madrid)
  • Thibault Helleputte (Machine Learning Group, ICTEAM institute, Université catholique de Louvain)
  • Pierre Dupont (Machine Learning Group, ICTEAM institute, Université catholique de Louvain)

Abstract

In this paper we propose a Bayesian model for multi-task feature selection. This model is based on a generalized spike and slab sparse prior distribution that enforces the selection of a common subset of features across several tasks. Since exact Bayesian inference in this model is intractable, approximate inference is performed through expectation propagation (EP). EP approximates the posterior distribution of the model using a parametric probability distribution. This posterior approximation is particularly useful to identify relevant features for prediction. We focus on problems for which the number of features d is significantly larger than the number of instances for each task. We propose an efficient parametrization of the EP algorithm that offers a computational complexity linear in d. Experiments on several multi-task datasets show that the proposed model outperforms baseline approaches for single-task learning or data pooling across all tasks, as well as two state-of-the-art multi-task learning approaches. Additional experiments confirm the stability of the proposed feature selection with respect to various sub-samplings of the training data.
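
To make the idea of a common feature subset concrete, the following is a minimal sketch of drawing task weights from a spike and slab prior whose selection indicators are shared by all tasks. It is not the paper's exact model or its EP inference procedure: the function name, the Bernoulli inclusion probability `p_relevant`, and the Gaussian slab with standard deviation `slab_std` are assumptions chosen for illustration.

```python
import numpy as np

def sample_shared_spike_slab(n_tasks, n_features, p_relevant=0.1,
                             slab_std=1.0, seed=0):
    """Draw weights from a spike and slab prior with shared indicators.

    Hypothetical illustration: all parameter names and default values
    are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # One binary indicator per feature, shared by every task.
    z = rng.random(n_features) < p_relevant
    # Slab: an independent Gaussian draw for each task and feature.
    slab = rng.normal(0.0, slab_std, size=(n_tasks, n_features))
    # Spike: a point mass at zero for every feature that is not selected.
    weights = np.where(z, slab, 0.0)
    return z, weights

z, W = sample_shared_spike_slab(n_tasks=3, n_features=1000)
```

Each row of `W` is the weight vector of one task. Because the indicator vector `z` is drawn once, every task has nonzero weights on exactly the same features, while the weight values themselves differ from task to task.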

Keywords

Multi-task learning; feature selection; expectation propagation; approximate Bayesian inference